This article explains how to implement session tracking using
two of the simplest and oldest methods available to programmers.
I feel that in order to appreciate the beauty of the technologies
that exist today, it is often necessary to understand what used
to be done before they came into being. The techniques presented
in this article do not rely on any of the newer session tracking
technologies. They are old, tried and tested ways which are
extremely popular even today. After reading this article you
should be able to implement session tracking in any language,
since you will understand the concepts behind session tracking
rather than some language-dependent implementation of it.
Various languages provide a higher level API for implementing session
tracking. Java, for example, has a detailed session tracking API
which lets programmers get session tracking implemented
quickly and easily. But that is not what this article talks about.
It focuses on understanding the basic techniques so that you can
use them with any language.
To understand this article you need 3 things -
1. Familiarity with any server side technology such as JSP, ASP,
Java servlets, etc.
2. A good knowledge of HTML.
3. An idea of how to access the contents of an HTML Form
from within a server side language such as JSP, ASP, etc.
What is session tracking?
Session tracking (for those who haven't heard of it) is a concept
which allows you to maintain a relation between 2 successive requests
made to a server on the Internet. Whenever a user browses any
website, he uses HTTP (the underlying protocol) for all the data
transfers taking place. This of course is not important to the
user. But it is for you as a programmer. HTTP is a stateless protocol.
When a user requests a page, the server returns that web page
to the user. When the user then clicks on a new link, the
server once again sends the new page that was requested. The server
(because of the use of HTTP as the underlying protocol) has no
idea that these 2 successive requests have come from the same
user. The server is not at all bothered about who is asking for
the pages. All it does is return the page that has been requested.
This is exactly what stateless means. There is no connection between
2 successive requests on the Internet.
What does HTTP being stateless have to do with session tracking?
There are many instances where some sort of connection is required
between 2 requests made by a user. And since all transfers on
the WWW use HTTP, which is stateless, this sort of connection is
not available by default. For example if you are at a website buying
books online, then you may add books to your Cart and continue
searching for more books. Every time you click on a new page your
previously selected books in the Cart should not disappear. If you
rely on the default way the WWW works, then since 2 successive
requests (by the same user) have no connection, there would be no
books in your Cart every time you click on a new link. I mean every
click would be considered as a separate request having no relation
to the previous request. Thus as you browse, all the information that
relates to you should be maintained and should be carried on as
you browse more and more. Your previous Shopping Cart contents
should be present when you want to add a new book to the Cart.
This is what session tracking enables you to do. It lets you maintain
an active session as long as you are browsing. And it gives HTTP
a sort of new quality, with every successive request having some
relation to previous requests within the same session.
Session tracking is so common that you may not even realise that
it is present. You might be used to it. It is used on almost every
site you visit on the net. Take Hotmail for example. Once
you enter your username and password you reach your inbox. Had there
been no session tracking, then every time you clicked on a
link in your inbox you would be asked for your password. This
would be the case since there would be no way to understand that
the one who had originally entered his username and password is the
same person who is currently asking for more pages. Session tracking
allows the site to store the information that you have successfully
logged in, and this information is checked every time you
do anything within your inbox. Thus you are not asked to
enter your password with every click. I can give you thousands
of examples where session tracking is used, but I guess you have
got the point.
Now let's begin with the actual way to implement session tracking.
I shall explain 2 ways to implement session tracking
1. Hidden Fields In Forms
2. URL Rewriting
I also conclude the article with a few lines on cookies, which
are also used for session tracking.
Hidden Fields In Forms
This is the simplest and easiest way to implement session tracking.
I find this method extremely useful to get the work done quickly.
I can explain this with the help of the example I was speaking
about - A Cart to hold your books.
Suppose you visit a site and are presented with a list of books
with a checkbox next to each of them. You can select books and
click on an Add to Cart Submit button. Sample code for such a
page is shown below.
Remember this is just what the code may look like and not the
exact page. You should try to understand the logic rather than
focus on the syntax. Also remember that these are all dynamic
pages being generated using some language such as JSP.
<b>Search results for books</b>
<form method="post" action="serverprogram.jsp">
<input type="checkbox" name="bookID" value="100">Java Servlet Programming<br>
<input type="checkbox" name="bookID" value="101">Professional JSP<br>
<input type="submit" name="Submit" value="Add to Cart"><br>
</form>
Suppose a page similar to the above one was generated when the user
searched for some books. The above page has only 2 search results. There
is a Form with 2 checkboxes, each next to the name of a book, and
a Submit button to add any selected books to the Cart.
Now suppose the user clicks on the checkbox next to the book named
'Java Servlet Programming', and then clicks on the Submit button.
Note that the value of a checkbox is used in this case
to store the bookID. Generally when you have many checkboxes, each
representing one of many entities of the same kind, the value
of each checkbox is what differentiates between them. In our case
since all the checkboxes represent books, each value represents
a different bookID and thus a different book (one book of many
books). This is a programming concept you would be familiar
with in case you have done web programming.
Now coming back to the point, in case the user checked the checkbox
next to the book named 'Java Servlet Programming' and then clicked
the Submit button, the contents of the form are all bundled together
and sent to the server side program. In our case the program is
named serverprogram.jsp . Now suppose at some later instant the
same user is searching for more books. On a search result
he might be presented with a page such as the one shown below. Remember
that he has already selected a book previously. So that book should
be present in his Cart and now he would like to add more books.
<b>Search results for books</b>
<form method="post" action="serverprogram.jsp">
<input type="hidden" name="bookID" value="100">
<input type="checkbox" name="bookID" value="150">Teach yourself WML Programming<br>
<input type="checkbox" name="bookID" value="160">Teach yourself C++<br>
<input type="submit" name="Submit" value="Add to Cart"><br>
</form>
Those of you who are experts in programming must have already
figured out how hidden fields help in session tracking. For the
rest of you who are like me and take more time to figure out
what is happening, let me explain..
The new search result produced once again 2 new books. One book
named 'Teach yourself WML Programming' with a bookID of 150 and
another book named 'Teach Yourself C++' with a bookID of 160.
So a form was generated with the names of these 2 books and with
2 checkboxes so that the user may select any of these books and
add them to the Cart. But there is one more important thing in
the form that was generated. There is a hidden input field named
bookID and having a value of 100. You might have noticed that
100 was the bookID of the book named 'Java Servlet Programming'
which the user had initially selected. This line describing a
hidden input does not make any difference to the HTML page displayed
in the browser. It is totally invisible to the user. But
within the form it makes a huge difference. This way
when the user keeps adding more and more books, there would be
many hidden input fields each with a different value, each representing
a previously selected book. When this form is submitted to the
server side program, that program would not only fetch the newly
selected checkboxes (newly selected books) but also these hidden
fields each representing a previously selected book by that user.
Note that all the input fields have the same name bookID but their
values are different. Within the server side program you would
simply expect a parameter called bookID which would be an array
with different values. You could extract all the values and then
use them as required. It is the job of the server side program
to add these lines indicating hidden fields whenever it generates
a new page.
Once again, the main concept to be understood is that a hidden
field displays nothing on the HTML page. So the
user who is browsing the page sees nothing unusual, but the value
associated with these hidden fields can be used to hold any kind
of data that you want. The only thing to take care of is that every
time your server side program generates a new form, it should
read all the parameters passed to it from the previous form and
then add all these values as new hidden fields in any new form
that it generates. Thus you can carry information from one HTML
page to another and thus maintain a connection between 2 pages.
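The idea of copying every received value into the next form can be sketched in a few lines of Java. This is only a rough sketch and not the exact code of serverprogram.jsp: the class name HiddenFieldCart and the methods renderForm and escapeHtml are names I have made up for illustration, and I assume that the bookIDs received with the previous request are already available as a list (in a servlet or JSP you would typically get them from request.getParameterValues("bookID")).

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: builds the next search-result form and carries the
// previously selected bookIDs along as hidden fields.
final class HiddenFieldCart {

    // Escape characters that are unsafe inside HTML text or attribute values.
    static String escapeHtml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // previousIds: every bookID received with the last request
    //              (old hidden fields plus newly ticked checkboxes).
    // results:     bookID -> title for the new search results.
    static String renderForm(List<String> previousIds, Map<String, String> results) {
        StringBuilder html = new StringBuilder();
        html.append("<form method=\"post\" action=\"serverprogram.jsp\">\n");
        // One hidden field per book that is already in the Cart.
        for (String id : previousIds) {
            html.append("<input type=\"hidden\" name=\"bookID\" value=\"")
                .append(escapeHtml(id)).append("\">\n");
        }
        // One checkbox per new search result.
        for (Map.Entry<String, String> book : results.entrySet()) {
            html.append("<input type=\"checkbox\" name=\"bookID\" value=\"")
                .append(escapeHtml(book.getKey())).append("\">")
                .append(escapeHtml(book.getValue())).append("<br>\n");
        }
        html.append("<input type=\"submit\" name=\"Submit\" value=\"Add to Cart\"><br>\n");
        html.append("</form>\n");
        return html.toString();
    }

    public static void main(String[] args) {
        Map<String, String> results = new LinkedHashMap<>();
        results.put("150", "Teach yourself WML Programming");
        results.put("160", "Teach yourself C++");
        // The user had already selected the book with bookID 100.
        System.out.print(renderForm(List.of("100"), results));
    }
}
```

Run as it is, this prints a form equivalent to the second search result page shown earlier, with bookID 100 carried along as a hidden field. Escaping the values is my own addition. It is not needed for plain numeric IDs but becomes important as soon as the carried data can contain quotes or angle brackets.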
The disadvantage of hidden fields is that in case you do not
want the user to know what information is being passed around
to maintain a session (in case that information is somewhat vital..maybe
a password or something) then this method is not the best one,
since the user can simply choose to View the Source of the HTML
page and get to see all the hidden fields present in the Form.
URL Rewriting
This is another popular session tracking method used by many.
But it has a few bad points associated with it. In spite of that
I like to use this method. It doesn't require a lot of understanding
to get the work done. URL Rewriting basically means that when
the user is presented with a link to a particular resource, instead
of simply presenting the URL as you would normally do, the URL
for that resource is modified so that more information is passed
when requesting that resource. I can see puzzled faces trying
to make sense of what is written above.. Read on and things shall
become clearer...
I will try explaining URL Rewriting with the same Shopping Cart
example used in the hidden field method. Actually I could have
shown simpler examples, but for you to compare the 2 methods I
shall take up the same example once again.
So once again assume that a user has searched for some books and
he has been presented with a search result that has 2 books listed.
It is basically a Form with 2 checkboxes, one for each book, and
a Submit button to add any of these books to his Cart.
<b>Search results for books</b>
<form method="post" action="serverprogram.jsp">
<input type="checkbox" name="bookID" value="100">Java Servlet Programming<br>
<input type="checkbox" name="bookID" value="101">Professional JSP<br>
<input type="submit" name="Submit" value="Add to Cart"><br>
</form>
Now once again suppose the user selects the book named 'Java Servlet
Programming' and then clicks on the Submit button. This would
pass the contents of the form to the server side program called
serverprogram.jsp which should read the selected checkboxes and
do the necessary (i.e. make some arrangements to keep track
of the selected books, which basically means implement session
tracking). Now suppose the user continues browsing and searches
for more books and is presented with a new search result just
like in the previous example. For better understanding I shall
once again give you the same 2 results as shown in the hidden fields
method, the 2 books named 'Teach yourself WML Programming' and
'Teach yourself C++'.
<b>Search results for books</b>
<form method="post" action="serverprogram.jsp?bookID=100">
<input type="checkbox" name="bookID" value="150">Teach yourself WML Programming<br>
<input type="checkbox" name="bookID" value="160">Teach yourself C++<br>
<input type="submit" name="Submit" value="Add to Cart"><br>
</form>
You should be able to guess by now what URL rewriting is all about.
In the above HTML source, the target for the form has been changed
from serverprogram.jsp to serverprogram.jsp?bookID=100 . This
is exactly what URL Rewriting means. The original URL which was
only serverprogram.jsp has now been rewritten as
serverprogram.jsp?bookID=100 . The effect of this is that any part
of the URL after the ? (question mark) is treated as extra parameters
that are passed to the server side program. They are known as GET
parameters, because the GET method of submitting forms passes its
fields in exactly this way, as name=value pairs appended to the URL.
Now when serverprogram.jsp fetches the parameters by the name
bookID it would be presented with the one that was present after
the ? in the URL as well as the checkboxes newly selected by the
user in that Form.
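To make it clearer how the server side program ends up with both the old and the new values under the same name, here is a small Java sketch that merges the part of the URL after the ? with the body of a posted form. The class ParameterMerge and its methods addPairs and merge are names I have made up for this illustration. In practice your server side technology normally does this for you (in a Java servlet, for instance, request.getParameterValues("bookID") returns the combined values), so treat this as an explanation of the idea rather than code you would have to write.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: shows how URL parameters and posted form fields
// can end up in one list of values per parameter name.
final class ParameterMerge {

    // Adds every name=value pair of a string such as "bookID=150&bookID=160"
    // to params, keeping all values that share a name.
    static void addPairs(String encoded, Map<String, List<String>> params) {
        if (encoded == null || encoded.isEmpty()) {
            return;
        }
        for (String pair : encoded.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.computeIfAbsent(URLDecoder.decode(name, StandardCharsets.UTF_8),
                                   key -> new ArrayList<>())
                  .add(URLDecoder.decode(value, StandardCharsets.UTF_8));
        }
    }

    // Values from the URL come first, followed by the values from the form body.
    static Map<String, List<String>> merge(String queryString, String formBody) {
        Map<String, List<String>> params = new LinkedHashMap<>();
        addPairs(queryString, params);
        addPairs(formBody, params);
        return params;
    }

    public static void main(String[] args) {
        // The URL carried bookID=100, and the user ticked the checkbox for book 150.
        Map<String, List<String>> params =
                merge("bookID=100", "bookID=150&Submit=Add+to+Cart");
        System.out.println(params.get("bookID")); // prints [100, 150]
    }
}
```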
Consider a general example where a user has selected 2 values.
Whenever a program generates a new Form, the target for that
form should look something like
<form method="post" action="serversideprogram.jsp?name1=value1&amp;name2=value2">
The name=value pairs are separated by an & (which is written as
&amp; inside HTML). This sort of URL would keep on growing as more
and more values have to be carried on from one page to another.
The basic concept of URL Rewriting is that the server side program
should continuously keep changing all the URLs and keep modifying
them and keep increasing their length as more and more data has
to be maintained between pages. The user does not see anything
on the surface as such but when he clicks on a link he not only
asks for that resource but because of the information after the
? in the URL he is actually sending previous data to the program.
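The rewriting itself can be sketched in Java as follows. Again this is a minimal sketch under my own assumptions and not a definitive implementation: UrlRewriter and rewrite are made-up names, and I assume the values to be carried along are available as a list of strings. The only library call used is java.net.URLEncoder.encode, which makes sure that characters such as spaces or & inside a value do not break the URL.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical helper: appends the values that must be carried along
// to a URL as name=value pairs.
final class UrlRewriter {

    static String rewrite(String baseUrl, String name, List<String> values) {
        StringBuilder url = new StringBuilder(baseUrl);
        // The first parameter follows a '?', every later one follows an '&'.
        char separator = baseUrl.indexOf('?') >= 0 ? '&' : '?';
        for (String value : values) {
            url.append(separator)
               .append(URLEncoder.encode(name, StandardCharsets.UTF_8))
               .append('=')
               .append(URLEncoder.encode(value, StandardCharsets.UTF_8));
            separator = '&';
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // The Cart already holds the books with bookID 100 and 150.
        System.out.println(rewrite("serverprogram.jsp", "bookID", List.of("100", "150")));
        // prints serverprogram.jsp?bookID=100&bookID=150
    }
}
```

The server side program would call something like this for every link and every form target it writes into a page. When the result is placed inside an HTML attribute, each & should additionally be written as &amp;.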
The disadvantage of URL Rewriting (though it's a minor one) is
that the URL displayed in the browser is of course the rewritten
URL. Thus the clean simple URL that was seen when hidden fields
were used is replaced with one that has a ? followed by many parameter
values. This doesn't suit those who want the URL to look clean.
Another disadvantage is that some browsers impose a limit on
the length of a URL. So once the data which is being tracked exceeds
a certain size, you may no longer be able to use URL Rewriting
to implement session tracking. But that limit is generally large
enough, so don't be afraid to use this method. Do note, however,
that actually rewriting all the URLs within your program is not
a simple task and requires some experience.
In case you are confused with what we have been doing with hidden
fields and URL Rewriting, I shall sum it up once again for you.
We are trying to learn methods that allow us to carry information
from one HTML page to another since by default you cannot pass
information from one HTML page to another. So to carry data from
one page to another, we are either using hidden fields invisible
to normal users or rewriting all the links on a page so that the
server side program receives the old as well as new data. Thus
we can maintain a session (a connection between multiple pages)
for every user.
Cookies
This is one of the most famous methods and the one used by almost
all professional sites. It gives you complete flexibility
as far as session tracking is concerned. But
it is not as easy as the other 2 methods. Besides, some applications
may not allow cookies, in which case you have to fall back on
the other 2 methods. I have designed websites using WML (Wireless
Markup Language) which worked on WAP based cell phones. Unfortunately
the cell phones did not have enough memory to support cookies,
so I had to use hidden fields to get session tracking working.
Cookies work on almost every computer, except
when a user has blocked all cookies for security reasons,
in which case you would once again have to use either of the other
2 methods.
There will be no code here to explain cookie usage. Using cookies
is probably the best and the neatest of all the methods to maintain
sessions. Cookies are basically small pieces of text that the browser
stores on the user's computer. They hold information pertaining to that
user. Once the cookie is created on the user's computer, then for
every further request made by that user in that session, the cookie
is sent along with the request. Typically the cookie holds a value
that is unique to each user browsing a particular website, so the
server side program can differentiate between various users.
The method to program cookies is different for different languages.
Most languages provide some class that covers all the details
of cookie creation and maintenance. For example in Java you have
a javax.servlet.http.Cookie class that is used to work
with cookies. Since I have decided to keep this article language
neutral and I had not planned to discuss cookies in depth, I will
not go into the details of cookie programming.
Finally...
For beginners however I suggest either of the first two methods to
implement session tracking. Rather than facing the learning curve
associated with cookies you could manage with one of the above
2 techniques, which you can implement using any language. My first
preference is always for hidden fields. But in cases where I am
not dealing with forms as such (which generally doesn't happen)
I also use URL Rewriting.
I hope this article gave you a sound introduction to session tracking.
I am sure you can use the knowledge presented here for your personal
programming needs. However in case you plan to implement a professional
website then I would suggest that you look into APIs specifically
designed for session tracking, which do all the above mentioned
work for you automatically without you worrying about the nitty-gritty
details.