Monday, May 25, 2009

Catching a captcha image, part 1 (C#)

Hey there,
Today I had a nice experience with a piece of spammer software that was packed with .NET Reactor.
I was astonished that the programmer had managed to trick a blog site's captcha system,
because the way they implement their captcha is horribly hard to get around. I had sometimes wondered how I could mess with this captcha thing myself.
Well, today I was lucky and got my answer, and that poor programmer was unlucky.
I unpacked his code and learned several things, which I'm going to sum up here:
1- If you want to write a spammer, most of the time you can't use the WebClient or WebRequest classes; you should do it with the WebBrowser control or SHDocVw, whichever you are more comfortable with.

2- Suppose you navigate to a site, something like "". You should wait until the document has fully loaded in your container. How can we wait that long?
Either by using the DocumentCompleted event,
or by using something like the following:
while (webBrowser.ReadyState != WebBrowserReadyState.Complete)
    Application.DoEvents(); // keep pumping messages so the page can finish loading
Until the page is loaded, the program is not going to do anything except load your page.
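
For the event-based option, here is a minimal sketch of how the DocumentCompleted event could be wired up. The class name PageLoader and the example.com address are my own placeholders, not something from the unpacked program, so treat it as an illustration only.

```csharp
using System;
using System.Windows.Forms;

// Hypothetical form that hosts a WebBrowser and reacts once the page has loaded.
class PageLoader : Form
{
    private readonly WebBrowser webBrowser = new WebBrowser { Dock = DockStyle.Fill };

    public PageLoader(string url)
    {
        Controls.Add(webBrowser);
        // Subscribe before navigating so the event cannot be missed.
        webBrowser.DocumentCompleted += OnDocumentCompleted;
        webBrowser.Navigate(url);
    }

    private void OnDocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        // The event also fires for every frame; only act on the top-level document.
        if (e.Url != webBrowser.Url) return;

        // From here on it is safe to read webBrowser.Document.
        Text = webBrowser.DocumentTitle;
    }

    [STAThread]
    static void Main()
    {
        Application.Run(new PageLoader("http://example.com/")); // placeholder address
    }
}
```

Compared with the while loop, the event does not keep the UI thread spinning, which is probably why it tends to be the more comfortable choice.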

Cross my heart, those are the only two things I learned from that lame programmer; I worked out the rest of these tips on my own:

3- On some websites, getting the captcha picture is a hard job. First of all, it may not show itself at all until you fire some other event, like focusing on a textbox or clicking a button. So you need a bit of knowledge of how to deal with JavaScript and how to find the bad boy in a good place. I did it with the help of Firefox and Firebug, to find out which script was being run (I'm not going to teach you how to use Firebug).

Once you know which script it is, you can execute it yourself, on your own terms, from your own code.
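
One way this could look, as a rough sketch: the WebBrowser's HtmlDocument has an InvokeScript method that calls a JavaScript function defined by the loaded page. The ScriptRunner helper and the function name in the usage line are hypothetical; the real function name is whatever Firebug showed you for that site.

```csharp
using System.Windows.Forms;

// Hypothetical helper: calls a JavaScript function that the loaded page defines.
static class ScriptRunner
{
    public static object Run(WebBrowser browser, string functionName, params object[] args)
    {
        // Document is null until a page has been loaded.
        if (browser.Document == null) return null;

        // InvokeScript returns whatever the script function returns (or null).
        return browser.Document.InvokeScript(functionName, args);
    }
}
```

Usage would be something like ScriptRunner.Run(webBrowser, "showCaptcha"), where "showCaptcha" is a made-up name, called only after the document has finished loading.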

4- Sometimes web designers fantasize that they are very clever and can do something that no one else in the whole world can do!
Can you imagine how stupid they are???
Well, they show the captcha image without any name or id attribute, I mean something like the following:

<img src="hxxp://">

As you can see, there is no clue that is going to help us get the src attribute.
Well, how can we overcome this problem? Easy peasy japanesey.

With the help of a foreach statement, searching through the HtmlElement objects, we can find our sweet captcha, as you can see in the following:

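foreach (HtmlElement img in wb.Document.Images)
{
    // IndexOf returns -1 when "Captcha" is absent; 0 is still a valid match.
    isitcaptcha = img.OuterHtml.IndexOf("Captcha", StringComparison.OrdinalIgnoreCase);
    if (isitcaptcha >= 0)
        captchaimg = img.OuterHtml; // keep the tag's markup, src included
}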
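
The loop leaves you with the whole tag, so you still need to pull the address out of it. A small sketch of one way to do that with a regular expression follows. The ImgTag class and the example.com address are my own illustration and not part of the original code.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: extracts the src value from the markup of a single <img> tag.
static class ImgTag
{
    public static string GetSrc(string outerHtml)
    {
        // Accepts src="...", src='...' and unquoted src=... forms, ignoring case.
        Match m = Regex.Match(outerHtml, "src\\s*=\\s*[\"']?([^\"'\\s>]+)", RegexOptions.IgnoreCase);
        return m.Success ? m.Groups[1].Value : null;
    }
}
```

If you still have the HtmlElement itself at hand, calling img.GetAttribute("src") on it is probably simpler than parsing the markup.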

5- Well, we've got our captcha source by now. Then we have to be nimble and show this captcha quickly, otherwise the web designer will again think that he was much smarter than you.
By showing the image inside a WebBrowser, you can have a copy of that captcha of your own.

6- Last but not least: how to save that goddamn captcha image out of my WebBrowser?
Believe it or not, it is a very tedious job. I spent around 6 hours searching through the libraries (I mean the .NET library, not a goddamn ordinary library where people go and read Harry Potter),
the Internet and my books until I found this useful weblog and its author's tricky way of saving WebBrowser pictures.

Below you can find a very first edition of what I've done:

Inside the code you will find a bunch of nonsense comments. Don't worry, they are written in Persian and they won't harm you ;)

In the next episode I will teach you how to OCR the captcha and retrieve it as text :D

