ocr and powerpoint

January 30, 2008 4 comments

One of the projects I’m working on requires reading text off a PowerPoint slide, or any type of presentation material for that matter. In a previous post I released a Perl script that can parse the text out of a native .ppt file using some clunky OLE automation. But in this new scenario, the slide is recorded as a JPEG image. I’m probably the last one on the planet to find GOCR, the open source OCR program. There is even a Windows binary that you can download. To replicate the problem I’m trying to solve, I do the following:

  1. Save the .ppt deck as .jpg. This feature will save either all the slides or just the current slide as .jpg files.
  2. Next, convert the image to greyscale in the .pnm format, because GOCR works best with greyscale .pnm images. For this step you’ll need to download djpeg.exe.
  3. Then run the following:
    > djpeg -grey -pnm test.jpg test.pnm
    > gocr test.pnm
    and that’s it!
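
When a deck is saved as many .jpg files, the two commands above can be wrapped in a small script. The following is only a rough sketch, assuming that `djpeg` and `gocr` are both on the PATH and that the slides sit in the current directory. The function names `build_commands` and `ocr_slide` are made up for this example, and it passes the output file with djpeg's `-outfile` switch instead of the two-filename form used above.

```python
import subprocess
from pathlib import Path


def build_commands(jpg_path):
    """Return the djpeg and gocr argument lists for one slide image."""
    jpg = Path(jpg_path)
    pnm = jpg.with_suffix(".pnm")  # test.jpg -> test.pnm
    # Convert the JPEG to a greyscale PNM file, as in step 2.
    djpeg_cmd = ["djpeg", "-grayscale", "-pnm", "-outfile", str(pnm), str(jpg)]
    # Run GOCR on the converted file; it prints the recognized text to stdout.
    gocr_cmd = ["gocr", str(pnm)]
    return djpeg_cmd, gocr_cmd


def ocr_slide(jpg_path):
    """Convert one slide and return whatever text GOCR recognizes in it."""
    djpeg_cmd, gocr_cmd = build_commands(jpg_path)
    subprocess.run(djpeg_cmd, check=True)
    result = subprocess.run(gocr_cmd, check=True, capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    # Assumption: every .jpg in the current directory is an exported slide.
    for slide in sorted(Path(".").glob("*.jpg")):
        print(f"--- {slide.name} ---")
        print(ocr_slide(slide))
```

Running the script in the folder of exported slides prints the recognized text for each slide in turn.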

Have a look at this site:

This is a great example where gocr and djpeg are being used.

html list for company types

January 10, 2008 2 comments

Always on the lookout for code snippets, I needed a dropdown list for different company types (in the US). Believe it or not, I couldn’t find what I was looking for. So I just put together my own… I’m not sure if this is a completely comprehensive list. Let me know if there’s something else that needs to be added:

<select name='company_type'>
<option value='Corporation'>Corporation</option>
<option value='Joint Venture'>Joint Venture</option>
<option value='Limited Liability Company (LLC)'>Limited Liability Company (LLC)</option>
<option value='Limited Liability Partnership (LLP)'>Limited Liability Partnership (LLP)</option>
<option value='Partnership'>Partnership</option>
<option value='Sole Proprietorship'>Sole Proprietorship</option>
<option value='Other'>Other</option>
</select>

free md5 generator tool – password maker

January 8, 2008 2 comments

I was playing with some open source code today. The app tries to be secure when adding a user to the system. When a new user is added, an email that includes the username and password is sent to their address. On my localhost, I’m too lazy to set up a bunch of test email@localhost addresses, so I needed a way to figure out what the auto-generated passwords were for my test user entries. I noticed that the app uses md5 to store the passwords. Again, being lazy, I didn’t want to add “print” statements to show the md5 output. So I created a little online tool that, given any text, will generate the md5 string. Then, using my HeidiSQL tool, I just paste the new string into the password column, and voilà! A new password for that user. This way I don’t have to deal with email confirms and all that.
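
For anyone who would rather generate the string locally, here is a minimal sketch using Python's standard `hashlib` module. It assumes the app stores the plain, unsalted md5 hex digest of the password, which is a guess on my part, so check how your app hashes before relying on it. The helper name `md5_hex` is made up for this example.

```python
import hashlib


def md5_hex(text):
    """Return the md5 digest of text as a 32-character hex string."""
    # Assumption: the app hashes the UTF-8 bytes of the password with no salt.
    return hashlib.md5(text.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Paste the printed value into the password column.
    print(md5_hex("password"))  # 5f4dcc3b5aa765d61d8327deb882cf99
```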

The free md5 generator tool is here:

