Thursday, May 28, 2015
Terminating the Twenties
I'm still working on the 1929 Wallulah, but I only have 20 pages left to process, so I should finish early tomorrow and then start on the 1930s. Thankfully, the type has gotten less decorative as the book progresses, so the software reads the text much more accurately and the process goes a lot quicker.
Wednesday, May 27, 2015
May 27
Today I continued working on processing the text on the 1929 Wallulah pages and made considerable progress. It's going a lot quicker now that I've got the hang of it and the pages aren't as text-heavy as the student pages. Hopefully, I'll be able to finish 1929 by the end of this week.
Tuesday, May 26, 2015
May 26
Today I continued working on the 1929 Wallulah and was able to make quite a bit of progress. After I completed the student name pages with lots of text, the process went much quicker.
However, I did come across a couple of problems near the end of the day. I noticed one of the pages looks skewed in the Verification Station, but when I opened the .jpg file on Drobo it looks straight.
Also, while searching for this file, I noticed that the file numbers on ABBYY are one higher than on Drobo; for example, 088.jpg on ABBYY is 087.jpg on Drobo. I'm not sure what caused this shift in file numbering, or how to go about finding the problem or correcting it, but I'll ask tomorrow.
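Until I know the cause, a small helper could translate between the two numberings when I'm looking for a page. This is only a sketch: it assumes the file names are three-digit numbers plus an extension and that the offset is a constant 1 across the whole book, which I've only checked for this one page. The name `shift_name` is just something I made up for illustration.

```python
from pathlib import Path

def shift_name(name, offset, width=3):
    """Return the file name with its number shifted by offset, zero-padded to width."""
    p = Path(name)
    return f"{int(p.stem) + offset:0{width}d}{p.suffix}"

# ABBYY's 088.jpg corresponds to the file one number lower on Drobo
print(shift_name("088.jpg", -1))
```

If the offset turns out to vary partway through the book, a fixed shift like this won't be enough.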
Friday, May 22, 2015
Sneaking in a post
Hello, it's Bronte!
Just wanted to leave an update on where I am:
Currently working on the digital exhibit for the Sackett Collection. I have all the images I want to use downloaded. There are way too many I want in the exhibit right now, but I'll try to keep it to a couple dozen or so. I tried to get some of them on IntelliJ but I don't think I did it right? I'll definitely need some help with it on Tuesday.
Have a great weekend.
Bronte
Wednesday, May 20, 2015
Working through Wallulah
Today I resumed processing the text on the 1929 Wallulah pages. I was able to clarify all the steps with Sara and make some progress through the first portion of the book. However, the verification software has a fair amount of trouble reading decorative fonts and very small type, so I'm finding I have to re-type the text in several of the text boxes. The student pages in particular take a little more time, since each student has a list of all their activities throughout college, so the type is especially small and ABBYY doesn't recognize it. Otherwise, I'm making progress and will continue to work on this tomorrow.
Tuesday, May 19, 2015
Day Two
Today I finished putting all the Blain Diary transcriptions into individual text files and was able to scan the few pages of the diary I had missed originally.
Following this, I started identifying the text on the scanned Wallulah pages. I imported all the images from the 1929 Wallulah into the ScanStation and proceeded to put them through the VerificationStation. However, when I transfer them to the VerificationStation they end up out of order. I was able to do the first 6 pages but didn't get much further. I also tried to accept documents through the IndexingStation, but the pages didn't appear in the proper Output folder, so I'm not sure where they ended up. I'll ask Sara about all these issues tomorrow morning when I begin work.
Monday, May 18, 2015
Beginning Blain
Today was my first day working as a summer intern in the Digital Kingdom.
I started by dividing the entire transcription of the Blain Diary into separate text documents that match up with each scanned page. We decided not to include the transcriber's notes but rather provide a link to the full document with the transcription and notes. I've completed this process up to approx. mid-September in the diary so I think I'll be able to finish this tomorrow.
I also discovered that I had originally missed scanning a few pages in the diary, which resulted in some major re-numbering of the .tif and .jpg file names. The .tif file names are now up to date, but I still need to revise the .jpg file names, which I should also be able to complete tomorrow.
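For the .jpg renaming, something like the following could work out the new names before anything is touched. It's a minimal sketch, not what I actually did: it assumes the files are named with a zero-padded three-digit number plus extension, and `plan_renames`, `insert_at`, and `count` are names I've invented for the example. It only builds a list of (old, new) pairs and doesn't rename anything itself.

```python
from pathlib import Path

def plan_renames(names, insert_at, count, width=3):
    """Return (old, new) pairs shifting every numbered file at or above insert_at up by count."""
    plan = []
    for name in names:
        p = Path(name)
        if not p.stem.isdigit():
            continue  # skip anything that isn't a numbered page file
        number = int(p.stem)
        if number >= insert_at:
            plan.append((name, f"{number + count:0{width}d}{p.suffix}"))
    # Highest numbers first, so a rename never lands on a file that hasn't moved yet
    plan.sort(key=lambda pair: int(Path(pair[0]).stem), reverse=True)
    return plan

# Example: two missed pages belong before the old page 002
for old, new in plan_renames(["001.jpg", "002.jpg", "003.jpg"], insert_at=2, count=2):
    print(old, "->", new)
```

Going from the highest number down matters because renaming 002.jpg to 003.jpg first would overwrite the existing 003.jpg.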