Continuation Lines
You can continue a command across multiple lines (for readability) by using the continuation
marker, which is a space and an underscore at the end of the line. For example:
GET http://altavista.com/cgi-bin/query _
q=$keyword&kl=XX&pg=q&Translate=on _
&search.x=24&search.y=5
Equal Signs
Be sure not to put any equal signs in your data. That will confuse the web server. For
example:
SENDFORM "name=craig&formula=25 * X = 15"
That won't work. It will send just "formula=25 *", because it thinks the "X = 15" is another
name/value pair.
Spaces
If there are any spaces in your data, you must surround that parameter with quotes. For
example:
SENDFORM name=craig&address=123 Main St.
This will not work, because the interpreter thinks that "name=craig&address=123" is one
parameter, "Main" is another parameter, and "St." is another parameter.
Therefore, this command must be written as:
SENDFORM "name=craig&address=123 Main St."
Also, be careful when using continuation lines: you still need to have quotes around the
parameter if there are any spaces in it, like:
SENDFORM _
fname=Craig _
&lname=Hilles _
&comments=These are my Comments _
&email=craigh01@gmail.com
Again, this will not work. It will think the command ends at "These". So it must be
set up like this:
SENDFORM _
"fname=Craig _
&lname=Hilles _
&comments=These are my Comments _
&email=craigh01@gmail.com"
Note that the continuation lines still work even though they are within quotes.
Another Way to Handle Spaces
If any command parameter has a space, you can keep the WebBot command interpreter from
treating what follows the space as a new parameter by using the tilde (~) character in place
of the space. For example, in the command:
POST http://alltheweb.com/add_url.php3 _
url=$url _
&email=$email _
&Submit=Add~URL
The "Add~URL" part will be interpreted as "Add URL" by WebBot.
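The same trick can be applied to the address example from the Spaces section. This is only a
sketch: it assumes the tilde is replaced in every kind of parameter, including SENDFORM data,
as the paragraph above suggests.

```
; No quotes needed: each tilde stands for a space
SENDFORM name=craig&address=123~Main~St.
```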
Debugging
Click the DEBUG checkbox on the command execution form to create a debug file, called
DEBUG.TXT, in your application directory. This file traces each command executed, to help
you see what's happening in your script.
The Recorder
The RECORDER is used to record scripts automatically. It uses a primitive web browser that
will display any relevant forms, links, and text. You can browse the web and record what
you do into a script file. This script file will not have any looping commands such as
GOTO, or other control commands. You can "flesh out" your script later with loops and
so on. The best thing about the recorder is that it figures out all the name/value pairs
in forms for you: like, name=joe&address=mystreet&zip=12345. You don't have to "view source"
to figure out the input names in the form: the Recorder will do that for you.
There are 2 ways you can record: DIRECT or SEARCH. In DIRECT recording, all variables are
sent exactly as found, including hidden variables. So it's nothing more than a series of
GET/POSTs to URLs with exactly the hidden variables and screen variables you enter.
The DIRECT method will work on very simple forms, such as most of the FFA site forms.
However, forms that generate certain hidden variables ON-THE-FLY will NOT work this way.
One example is the SUGGEST A SITE link on Yahoo: it has a /fast/add?xxxxx where xxxxx is a
number generated on the fly, say 185721. If you just use 185721, it won't work; you have to
use the number generated by the site, which will be different every time.
In situations like this, you must use the SEARCH method. In this case, the generated code
SEARCHES for the link or the form (SEARCHLINK or SEARCHFORM commands). These commands pick
up any hidden variables. Then, the generated code does a SENDFORM with any variables you
entered on the form (text boxes, textarea boxes, checkboxes, drop-down boxes, etc).
Using the SEARCH method, the generated code searches for whatever text happened to be on
that link, or if it's a form, it searches for whatever text is on the SUBMIT button.
This may need to be modified, for example if you hit a link called "Games", there might
be more than one link on that page with the word "Games" in it. You may need to use the
EXCLUDE option of SEARCHLINK/SEARCHFORM, or make the search more specific in another way.
So, bottom line, you may need to modify the generated code that comes out of the SEARCH method.
With the DIRECT method, you probably won't need to modify the code at all: it should work
exactly as generated.
Items on the Recorder screen:
Go to URL
Use this button to go to the URL that you enter in the URL box at the top. This will also
be recorded.
Print Forms
This will print (to the screen) the contents of an internal array that stores all the
information about any FORMS and LINKS on this web page. This can come in handy to see
all the forms, input variables, and hidden variables on that web page.
Back
This is like the BACK key of a web browser. Use it to go to the previous page. This will
also record a BACK command in your record file.
Pause Recording
Use this to PAUSE recording, if you are browsing through some web pages you are not interested
in. This button will then toggle to the "Resume Recording" button.
Resume Recording
If you've paused recording, you may now RESUME recording.
Set Variables
This is a very important button. Using this, you can set variables in your script: it will
put a SET command in your script, for example: SET $name = "Craig Hilles". After that, you can
enter the VARIABLE in the actual input box, instead of the value! So, say a web page form
asks for your name. You can set the variable (as above), then just put $name in the box.
When you SUBMIT the form, the pseudo-browser will send the VALUE of your $name variable.
But in the recording file, it will put just the VARIABLE.
MODIFYING PRE-PROGRAMMED SCRIPTS
WebBot comes with many pre-setup scripts. You simply have to make a few small modifications
to the scripts to fit your needs. The scripts so far are:
- WebPosition.wbc - Find your position based on your keywords in many search engines
- Submit.wbc - Submit your web site to many search engines
- Submit_Yahoo.wbc - Automatically submit your sites to Yahoo!
Note that many of these scripts have SUBPROGRAMS. So if you list all the .WBC files, you'll
see more than just the scripts above. Some of them are subprograms to other main programs.
DO NOT USE THE SCRIPTS AS IS! You need to modify them with your URL, your descriptions,
etc.
Here are the steps for modifying some of the scripts that come with WebBot.
Submit to Yahoo Script
This is one of the more complicated WebBot Scripts. Here are the steps to modify it to your
needs.
- Start at the Yahoo site, and go to each directory link until you find
the spot you want to suggest your site at.
- On each click, write down the text of the link you clicked, such as "Entertainment", then
"Games".
- Use Notepad to modify the script file SUBMIT_YAHOO.WBC
- Change the $NAME and $EMAIL variables to reflect your information
- Next comes a group of commands for EACH PAGE you are submitting. Do the following steps
for each of your web pages. DELETE any extra blocks if you have fewer pages than I have.
COPY more blocks if you have more web pages than I have.
  - Set up the $title, $url, $comments, and $info information for this web page
    - Note: $info is the "comments to editor" section of the submission
  - Change each $dir entry to the text of the links you clicked and wrote down earlier
  - Blank out any $dir entries you don't need
  - The $excldir is to EXCLUDE a certain string. For example, if there is a link for "games"
  and a link for "toys and games", and you want to hit the "games" link, you could set
  $excldir = "toys".
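As a rough sketch, one page block in SUBMIT_YAHOO.WBC might look something like the following
after editing. The comment and info text and the directory names are made-up placeholders, and
the $dir1 to $dir3 names are inferred from the $dir4 to $dir6 variables mentioned in the
IF_ISEMPTY section of this manual. Check the script file itself for the exact variable names
and for the command that follows each block.

```
; Hypothetical values: replace them with your own page details
set $title = "General Trivia Game"
set $url = "http://www.aaagames.com/trivia.htm"
set $comments = "A free web trivia game"
set $info = "Please list under Games"
; Text of each directory link you clicked, in order
set $dir1 = "Entertainment"
set $dir2 = "Games"
; Blank out the directory levels you don't need
set $dir3 = ""
; Skip links such as "toys and games"
set $excldir = "toys"
```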
COMMANDS REFERENCE
In version 1.0 the script commands are very limited. I put in just enough flexibility to
get done what I needed to do. As we get more users of the product, there will certainly be
much more power built into the script language. For those that invest the $49.95 in WebBot,
you get FREE lifetime upgrades. Following are the currently supported commands.
BACK
The BACK command is like hitting the "BACK" key on your browser. Normally you would not
use this command in scripts you write. It is usually generated by the RECORDER, because
when recording you may not know step-by-step in advance which pages to go to. You sort of
"browse", which necessitates the use of the BACK key.
When writing a script from scratch, the sequence of web pages is known, so you are
unlikely to need the BACK command. You could, however, script something like getting
a list of results from a search engine, going to some of the links, then hitting
the BACK key and going to the next link. Even so, the FOLLOWLINKS command would probably
make more sense for that type of operation.
The BACK command has no parameters.
Comments
To insert a comment into your script, simply make the first character of the line a
semicolon (;). You can also have as many blank lines as you wish in the script. Additionally,
you can indent lines: indentation is ignored by WebBot. This helps the readability of the
script within loops.
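A small sketch of how comments, blank lines, and indentation can be combined. The variable
name, label name, and loop limit are arbitrary choices for illustration.

```
; Count up to 3, then stop
set $count = 0

label again
    let $count = $count + 1
    if $count < 3
        goto again
    endif
stop
```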
DO subroutine_filename
| subroutine_filename | the name of the file containing the subroutine to execute |
The DO command is how WebBot implements subroutines. Each subroutine should be stored
as a separate text file (with extension .WBC). Note that WebBot script files are nothing
more than plain text files: they should be edited with Notepad or, if using WordPad,
saved as a text file WITHOUT formatting.
You can have as many levels of subroutines as you want, but at some point the program
may experience a "stack overflow". I usually don't go farther than 3 levels down.
Use subroutines to avoid repeating code. For example, if you are going to submit multiple
web pages to NorthernLight.com, write a subroutine such as "submit_nlight.wbc". Then in
the main program, you can set up variables, and then submit multiple pages, such as:
set $url = "http://www.aaagames.com/graffiti.htm"
set $desc = "Graffiti Wall: like the old Graffiti Wall programs on BBS systems."
set $keywords = "graffiti wall"
do submit_nlight
set $url = "http://www.aaagames.com/trivia.htm"
set $desc = "General Trivia Game"
set $keywords = "Web Trivia"
do submit_nlight
ENDIF
The ENDIF statement is used to terminate a block of statements after the IF, IF_FOUND,
IF_NOTFOUND, IF_ISEMPTY, or IF_NOTEMPTY statements. See the documentation on these IF
statements for more information and examples. The ENDIF statement has no parameters.
EXIT
The EXIT command exits from the current subroutine, going back to the calling program.
If you are in the MAIN program (not a subroutine), the EXIT command will stop the script,
having the same effect as the STOP command.
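A sketch of EXIT used at the top of a subroutine file to return early. The file name and the
choice of $url as the variable to check are hypothetical.

```
; sub_check.wbc (hypothetical subroutine file)
; Go straight back to the caller if no URL was set
if_isempty $url
    exit
endif
; Otherwise carry on: is the URL mentioned on the current page?
searchtext $url
```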
GET url name_value_pairs
| url | the URL to GET or POST to |
| name_value_pairs | a string of name/value pairs such as name=Joe&address=132 Elm |
The GET command will usually be the first or one of the first commands in your WebBot script.
The first thing you do is to GET (or POST) to a certain URL's form, with certain variables
(name/value pairs). This gets the whole ball rolling.
In the GET command, you need exactly 2 parameters: the URL, and a string of Name/Value pairs,
like this:
GET http://www.webcrawler.com/cgi-bin/webQuery searchText='Web Trivia'&userid='Sam'
Note that the Name/Value pair string MUST be a full text string without spaces, with the
name/value pairs separated by ampersands, for example: user=sam&password=sammy&orderno=1
GOTO label
| label | The LABEL statement to go to |
Use the GOTO command to implement loops: a series of statements that are executed until
some condition is met, something like this:
label loop
searchtext $mypage
if_found
endif
sendform
goto loop
label end
label found_it
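The listing above appears to have lost some of its lines. A fuller loop of this kind, written
only with commands described in this reference, could look like the sketch below. The "Next"
link text, the report messages, and the assumption that $mypage was SET earlier in the script
are illustrative and not taken from a shipped script.

```
label loop
    searchtext $mypage
    if_found
        goto found_it
    endif
    searchlink 1 "Next"
    if_notfound
        goto end
    endif
    sendform
    goto loop
label end
    print L "Not listed:", $mypage
    stop
label found_it
    print L "Found:", $mypage
```

Each pass checks the current results page for $mypage, then follows the "Next" link. The loop
ends either when the text is found or when there is no further "Next" link to follow.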
IF expr1 operator expr2
| expr1 | A variable or a constant |
| operator | =, != (or <>), >, <, >=, <= |
| expr2 | A variable or a constant |
Use the IF statement to compare 2 variables, or compare a variable to a constant. The
IF statement must be terminated by an ENDIF. If the statement is true, the statements
between the IF and ENDIF statements will be executed, for example:
if $pageno > $maxpages
endif
The operators that can be used are equal (=), not equal (!= or <>), greater than (>),
less than (<), greater than or equal to (>=), and less than or equal to (<=).
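A sketch of an IF block with statements inside it, used here to cap the number of pages a
script will visit. The starting values and the label name are arbitrary.

```
set $pageno = 1
set $maxpages = 10
label nextpage
    let $pageno = $pageno + 1
    if $pageno > $maxpages
        stop
    endif
    goto nextpage
```

The work done on each page would go between the LABEL and LET lines.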
IF_FOUND
The IF_FOUND statement is used to check the results of a SEARCHFORM, SEARCHLINK, or
SEARCHTEXT statement. For example, you may want to do a SEARCHTEXT to see if the
word "success" exists in the web page. It might look something like this:
searchtext "successfully submitted"
if_found
endif
goto its_bad
As with the IF statements, all the statements between the IF_FOUND and ENDIF statements
are executed if the expression is true. The IF_FOUND statement has no parameters.
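A sketch of a complete success check built on IF_FOUND. The report messages are made up, and
the its_bad label is placed directly after the check only to keep the example short.

```
searchtext "successfully submitted"
if_found
    print L "Submission accepted"
    stop
endif
goto its_bad

label its_bad
print L "Submission failed"
```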
IF_ISEMPTY variable
| variable | The variable to check |
The IF_ISEMPTY statement is used to see if a variable is empty. For a numeric variable,
this means the variable is ZERO; for a string variable, it means the variable is a null
string. Again, the IF_ISEMPTY statement is terminated by the ENDIF statement.
This is useful for subprograms that need to check if the calling program has set a variable.
For example, in the SUBMIT TO YAHOO script, the calling program sets up the directory
entries to click. Between 1 and 6 directory levels can be set to click. If the particular
submission only goes down 3 directory levels, $dir4, $dir5, and $dir6 will be empty. Thus
the use of the IF_ISEMPTY statement.
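A sketch of how such a check might look inside a subprogram. It assumes SEARCHLINK accepts a
variable as its search string (SEARCHTEXT is shown taking one elsewhere in this manual). The
label name is made up, and the real SUBMIT TO YAHOO script may be organized differently.

```
; Skip the fourth directory level if the caller left $dir4 empty
if_isempty $dir4
    goto do_submit
endif
searchlink 1 $dir4
sendform
label do_submit
```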
IF_NOTEMPTY variable
| variable | The variable to check |
The IF_NOTEMPTY statement is the opposite of the IF_ISEMPTY statement: it simply checks
if the variable has a value other than zero (for a number) or a null string (for a string).
See the IF_ISEMPTY documentation for more information.
IF_NOTFOUND
The IF_NOTFOUND statement is used to check the results of a SEARCHFORM, SEARCHLINK, or
SEARCHTEXT statement. This works just like the IF_FOUND statement, except that it checks
to see if the last search statement FAILED. See the IF_FOUND documentation for more
information.
LABEL label_name
| label_name | a name that is used to GOTO this label |
The LABEL command is used in conjunction with the GOTO command: it gives the GOTO command
a place to go. Yes, I realize this is an affront to "structured programming", but I guess
I'm a rebel!
LET variable1 = expr1 operator expr2
| variable1 | the variable to change. This MUST be a variable |
| = | the equal sign must be the second parameter |
| expr1 | a variable or a constant |
| operator | plus (+), minus (-), divide (/) or multiply (*) |
| expr2 | a variable or a constant |
The LET command is used to change a variable. Many times this will be used to update a
counter, as in:
let $numpages = $numpages + 1
If the variables are STRING variables, only the plus (+) operator can be used, in which
case the strings are concatenated.
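A sketch of string concatenation with LET. The variable names are arbitrary, and $fullname is
defined with SET first because the SET documentation says every variable must be.

```
set $first = "Craig"
set $last = "Hilles"
set $fullname = ""
; Joins the two strings; no space is added between them
let $fullname = $first + $last
```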
POST url name_value_pairs
| url | the URL to GET or POST to |
| name_value_pairs | a string of name/value pairs such as name=Joe&address=132 Elm |
Please note that the POST command is exactly the same as the GET command, except that it
uses the POST method of HTTP instead of the GET method. Refer to the GET command for
further information.
PRINT mode string1, string2 ... string100
| mode | T = Table, L = Line |
Use the PRINT statement to produce a report from your script. It is not required to have
a report in your script, or the report can be something simple such as "successfully submitted
TRIVIA page to WebCrawler", or the report can be a table that shows the results of your
script.
The first parameter to PRINT must be T or L. Use L to just print a simple line, with
whatever strings or variables you wish to print. Use T to print in a table format. All
reports are created on disk as an HTML file, which is then displayed by WebBot, or can also
be displayed by your web browser.
After the T or L parameter, pass any number of strings or variables. If it's a variable,
the value of the variable will be printed.
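A sketch of both PRINT modes, following the comma-separated form of the syntax line above.
The variables are assumed to have been SET earlier in the script. How the T mode arranges the
values into table columns is not described here, so treat the last line as a guess at typical
usage.

```
print L "Successfully submitted TRIVIA page to WebCrawler"
print L "Pages checked:", $numpages
print T $url, $numpages
```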
SEARCHFORM number string
| string | the string to search for |
SEARCHFORM searches the page for a certain FORM that you want to submit, based on a text
string. For example, you may want to hit the NEXT button. So you SEARCH the form for "next".
SEARCHFORM will find and store all the HIDDEN variables in that form, and pass it along to
the web site. So, for example, the NEXT button on a search engine typically has some hidden
variable that keeps track of what your starting item is. These variables are passed along,
(when you use the SENDFORM command), so the web site SHOULD react just as if you hit the
button from your web browser.
SEARCHLINK number string [exclstring]
| number | a number from 1 to 10: the occurrence you want to use |
| string | The string to search for |
| exclstring (optional) | a string to EXCLUDE |
SEARCHLINK is just like SEARCHFORM, except that instead of searching for a FORM in the page,
it searches for a HYPERLINK. Like a form, the hyperlink may pass along variables, so the
SEARCHLINK command stores these variables. Like the SEARCHFORM command, this command is used
to find the link to "click". After that, you use the SENDFORM command to actually send
to the web site, as if you "clicked" on that link.
NOTE: if there are FEWER occurrences than your number, it will take the LAST one.
You may also specify a string to EXCLUDE. For example, say there is a link for Toys & Games,
and another link for Games. You want the link for games. Therefore you can EXCLUDE "Toys",
thus it will pick up the link you wanted.
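The Toys & Games case above can be written as the following sketch. The link text is taken
from the example in the paragraph, not from a real page.

```
; Take the first link containing "Games", skipping any that contain "Toys"
searchlink 1 "Games" "Toys"
; "Click" the link that was found
sendform
```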
SEARCHTEXT searchstring
| searchstring | the string to search for |
This simply searches the entire web page, including any hidden strings, for the string
you are searching for. For example, in my script to find what page aaagames.com is
listed on, on each page I do a SEARCHTEXT "aaagames.com". There must be exactly one
parameter to this command: the search string. If there are spaces in your string,
you must enclose the string in quotes, otherwise you can leave off the quotes.
SENDFORM [name_value_pairs]
| name_value_pairs (optional) | A string of additional name/value pairs to send |
The SENDFORM command is used to POST (or GET) a form or link that you had previously found
with the SEARCHFORM or SEARCHLINK command. SENDFORM must always be used in conjunction
with a SEARCHFORM or SEARCHLINK command.
SENDFORM will pass to the web site all the hidden variables that were picked up in the
SEARCHFORM or SEARCHLINK, as well as any additional name/value pairs you wish to send along.
For example, suppose you are scripting a web site login. You know that you will get
a userid/password screen after doing an initial GET command. So you do a SEARCHFORM
to search for that userid/password form. There may be some HIDDEN variables in the
form, which may CHANGE each time you log in. So you can't just hard-code the variables
with the GET or POST command. You have to do a SEARCHFORM to search for the literal text
that is on the SUBMIT button of the form, then do a SENDFORM to send your userid/password
(known in advance), along with any hidden variables. For example, if you (for some reason)
wanted to automate logging into my trivia game at aaagames.com, it would look something
like this:
;Go to the Trivia page
get http://www.aaagames.com/trivia.htm
;Find the link to run the trivia game
searchlink 1 "play trivia"
;click the button to play trivia
SENDFORM
;Find the "Enter password" button on the userid/password form
searchform "enter"
;Send the form, with hidden variables, along with my userid and password
sendform u=munge&p=abcdefg
SET variable = value
| variable | the variable for which to set a value |
| = | the EQUAL sign must go here |
| value | the VALUE to set the variable to. Can be another variable or a constant |
The SET command is used to set an initial value to a variable. Any variables that are used
in the script must first be defined with the SET command. For string variables, set it
to a value within quotes. For numeric variables, set it to a number. Example:
Or...
set $email = "craigh01@gmail.com"
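Two more forms, as a sketch: a numeric value, and a value copied from another variable. The
names $pageno and $contact are arbitrary, and the second line assumes $email has already been
set as shown above.

```
set $pageno = 1
set $contact = $email
```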
STOP
The STOP command ends the script. This command will completely stop the script, regardless
of how many levels deep in a subroutine you are.
If you wish to just stop, or exit from, the current subroutine, use the EXIT command.
The STOP command has no parameters.
Custom Programming
I can do custom modifications to WebBot, and build custom scripts. My rate is $50 per hour,
and the price is a FLAT rate based on my estimate of hours. (It always takes me longer to
do than I estimate, so that's a bargain for you). Your requirements may be sponsoring
improvements to the program as a whole, if they make sense for general users of the product.
Contact me at:
Or, believe it or not, a phone number (hmmm PHONES, these might have some applications for
business!):
(330) 676-0705 Ask for Craig Hilles
FREE Custom Programming
If you invest the $49.95 in WebBot, we will CUSTOM-PROGRAM your first script for free!
This is subject to "doability", i.e., is it doable by WebBot. We will make the decision as
to whether the script you are requesting can be done with a minimal effort and/or minimal
modifications to WebBot itself. We'll let you know if your script can or cannot be done
by us before you make your investment in Webbot.
Shameless Self-Promotion
In this section we compare our product to many other programs out there that automate
tasks on the web.
WHY WebBot is better than Web Position Gold
- With Web Position Gold, you can report on those items that are built into Web Position Gold
- With WebBot, you can report on ANYTHING
- With Web Position Gold, if the web sites change their parameters, you must download and install
an update from WPG. Are you sure WPG is going to be around X number of years from now?
- With WebBot, you can just change the script! It WILL work, FOREVER!
Ordering Information
To order WebBot, please send a check or money order for $49.95 to:
Online Consulting
P.O. Box 1383
Kent, OH 44240
Or, you can order online using Visa/MasterCard at:
http://www.castle64.com/webbot.htm
In case that web site is down, try my other web site at:
http://www.aaagames.com/webbot.htm