Passing Data through the Window : Using Sockets

Now that we know how to write a client and a server in very simple terms, let's see how a real-world client-server pair works.

The first thing you should be wondering about is how the same port of the server can handle so many client requests. A server that does this is called a concurrent server. When you make a server listen through a socket on a port, you are not actually using that same socket to respond to the requests. When a client connects to a server's listening socket, the server hands the connection over to a new socket, while listening continues on the original socket. When you see the upcoming server code you'll see how this works.

For example, if listeningSocket is a socket listening on some port and it gets a connection request from a client, it assigns a new socket to that client and then goes on listening. This happens in the function call:

(redirectedSocket, clientConnected) = listeningSocket.accept()  # as you must have guessed, the function returns a tuple
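To make the idea concrete, here is a minimal sketch of one listening socket serving two clients, each through its own socket returned by accept(). The names (client, received) are made up for this illustration, everything runs on the loopback address, and port 0 asks the OS for any free port. It is written to run on Python 3. It handles the clients one after the other; a truly concurrent server would hand each new socket to its own thread or process.

```python
import socket
import threading

# Listening socket on the loopback interface; port 0 lets the OS pick a free port.
listening_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listening_socket.bind(('127.0.0.1', 0))
listening_socket.listen(5)
port = listening_socket.getsockname()[1]

def client(message):
    # Each client opens its own connection to the same server port.
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect(('127.0.0.1', port))
    s.sendall(message)
    s.close()

threads = [threading.Thread(target=client, args=(m,)) for m in (b'first', b'second')]
for t in threads:
    t.start()

received = []
for _ in range(2):
    # accept() returns a tuple: (new socket for this client, client address).
    redirected_socket, client_address = listening_socket.accept()
    assert redirected_socket is not listening_socket
    data = b''
    while True:
        chunk = redirected_socket.recv(4096)
        if not chunk:          # empty result: this client closed its socket
            break
        data += chunk
    received.append(data)
    redirected_socket.close()

for t in threads:
    t.join()
listening_socket.close()
print(sorted(received))
```

The listening socket is never used to read data; it only produces new sockets.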

Another thing you'll observe in the code below is about TCP. The sockets you'll be seeing are all TCP sockets, so you need to make sure that all the data you send on the client side is actually received on the server end (and vice versa).

Now although we see sockets as an abstraction, a lot of events go on behind the scenes when you send data through a socket. Say you want to send a string across a socket: your string may be packed into one or more packets, and you will never know how many. Say we do mySocket.sendall('Hello I am Muktabh Mayank'). This string is sent across the network, but it may travel in several packets, and you don't know the number.

Also, these packets may need to be resent if the acknowledgement packets for them are not received. So sendall() keeps working until either all of your data has been handed over for sending or an error occurs, and the code after it does not execute until then. Such functions are called blocking functions. accept() behaves the same way: it blocks until a client connects. (listen() itself returns immediately; it only marks the socket as a listening socket, and the operating system keeps queueing incoming connections while your other code runs.) Once sendall() returns you can call it again on the same connected socket. In the client below I simply close the socket after a single sendall(), because closing is how the server's receive loop learns that there is no more data.

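recv(), the function that receives what sendall() sends, is also a blocking function by default: it waits until some data arrives. But a single recv() call returns only what has arrived so far, up to the buffer size you pass, which may be less than everything that was sent. Hence it needs to be used in a loop that keeps receiving until recv() returns an empty string, so that no data is missed. send(), the thinner wrapper around the actual system call, is also available, but here I'll be using sendall().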
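A small sketch of the receive loop, written for Python 3. It uses socket.socketpair(), which returns two already-connected sockets, so no server has to be set up. The tiny buffer size of 8 is chosen only to force the data to arrive in several pieces; the variable names are invented for this example.

```python
import socket

# socketpair() gives two already-connected sockets, so no server setup is needed.
sender, receiver = socket.socketpair()

sender.sendall(b'Hello I am ')
sender.sendall(b'Muktabh Mayank')   # a second sendall() on the same socket is fine
sender.close()                      # closing signals end-of-data to the receiver

chunks = []
while True:
    chunk = receiver.recv(8)        # blocks until data arrives; returns at most 8 bytes
    if not chunk:                   # empty result: the sender has closed
        break
    chunks.append(chunk)
receiver.close()

message = b''.join(chunks)
print(len(chunks), message)
```

The receiver cannot tell where one sendall() ended and the next began; it just sees a stream of bytes, which is why the loop joins everything until the empty result.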

Another thing: my code works on a Linux system and I have not tested it on Windows. I've heard that some old version of Python had a problem with the networking interface on Windows (which, from the article I read, later turned out to be a Windows fault and not Python's), so try using Linux.

First the server code:

#!/usr/bin/env python

import sys, socket

host = ''  # we didn't fix the hosts to which we'll cater
port = 3333

try:
    # an IPv4 (AF_INET), TCP (SOCK_STREAM) socket
    newSocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
except socket.error:
    print("Can't create socket")
    sys.exit(1)

try:
    # We bind because this is the server side: the socket must have a fixed
    # port address to listen on, since clients have to connect to it.
    # In the client there was no such fixed port.
    newSocket.bind((host, port))
except socket.error:
    print("Can't bind")
    sys.exit(1)

print('Server will now run on port %d' % port)
newSocket.listen(1)  # now it's ready to take in connections from clients

try:
    # a new connection is handed to another socket,
    # so that newSocket can continue to listen
    clientSock, clientAddress = newSocket.accept()
except socket.error:
    print("Can't connect")
    sys.exit(1)

print('connected to %s' % (clientAddress,))
print('client says :')

while 1:
    buf = clientSock.recv(4096)
    if not buf:  # an empty string is received once the sender socket closes
        break    # leave the while loop
    print(buf)

clientSock.close()
newSocket.close()
sys.exit(0)  # successful termination

Now the client code:

#!/usr/bin/env python

import sys, socket

port, hostName, message = 3333, 'localhost', b'hi'

newSocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

try:
    newSocket.connect((hostName, port))
except socket.error as e:  # also covers socket.gaierror (host name lookup failure)
    print('Unable to connect, %s' % e)
    sys.exit(1)

print('message from client side %s' % message)
newSocket.sendall(message)
newSocket.close()

Make sure you run the server code first 😛 and see your packet reach the other side of the socket.

My first sockets

Network programming is a very important application of Python. Python has a big library of wrappers around system calls (system calls are functions which ask the operating system to do some job for us; most networking functionality provided by any OS is accessible through a set of system calls). Normally system call interfaces are provided in C (it's C for Linux; for Windows we know the format of the system calls but not what language they have been coded in). The libraries in Python do little more than define wrapper functions for these system calls, i.e. when you call such a function in Python it just calls the corresponding system call in C.

So for people who are totally unaware of networking concepts, I'll direct you to the networking video tutorials by Bucky (they are the coolest tutorials I could ever find on the net) at:

http://www.thenewboston.com/?p=110

If you can't play these, you can search for them on YouTube.

This guy even has a set of super cool Python tutorials on the same site:

http://www.thenewboston.com/

Maybe you can check them out.

After seeing these tutorials maybe you'll like to dig a bit more; here's a small set of free study material which will suffice for the basics.

http://en.wikibooks.org/wiki/Network_Plus_Certification

So now I'll assume you know at least what the various layers in the OSI model are and what functions are accomplished at each of the layers.

First thing: what is a protocol? A protocol is a standard format in which traffic is to be sent across the network line, so that it's legible to the receiver. Essentially, the code to produce packets (or the data unit at any other layer of the OSI model) which follow these fixed rules is pre-written in your OS or in some software. All you have to do is use system calls to use this code. So you can think of protocols as sets of functions which convert our data from a human-understandable format to a network-wide understandable format.

The most common protocol at the network layer is IP, the Internet Protocol. This protocol defines what packets look like when they are sent across a network supporting IP traffic. All the functionality to convert a data unit you create at the transport layer (also called a segment) into a packet that follows the rules of IP is in the OS. All you have to do is call the proper method (system call), which will do it for you.

At the transport layer the common protocols are UDP and TCP. UDP is a protocol which says that if a packet is received by the receiver, the receiver will have the entire information that was in that packet (atomicity), but it doesn't guarantee that each of the packets reaches the receiver. Protocols of this type are called connectionless protocols. The other protocol is TCP, which is connection-oriented. Data sent over TCP is retransmitted until the receiver acknowledges it, and if it cannot be delivered you get a failure report. TCP, UDP, IP and the set of protocols used above them in networking are together called the TCP/IP stack.

Say you want to send some data (say x.txt) across the network. What the application level does is divide the file into small chunks of data which, after passing through the functions of the TCP/IP stack, are converted to packets which can be sent across the network. The small parts of the file are converted into packets by adding some attribute values to them, called headers if they are in front or trailers if they are at the end.
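Just to picture the chunk-plus-header idea, here is a toy sketch in Python 3. It is not how TCP or IP really lay out their headers: the 4-byte header (a sequence number and a payload length) and the function names make_packets and reassemble are invented for this illustration.

```python
import struct

def make_packets(data, chunk_size):
    """Split data into chunks and prefix each with a toy 4-byte header:
    a 2-byte sequence number and a 2-byte payload length."""
    packets = []
    for seq, start in enumerate(range(0, len(data), chunk_size)):
        payload = data[start:start + chunk_size]
        header = struct.pack('!HH', seq, len(payload))
        packets.append(header + payload)
    return packets

def reassemble(packets):
    """Read each toy header and put the payloads back in sequence order."""
    parts = {}
    for packet in packets:
        seq, length = struct.unpack('!HH', packet[:4])
        parts[seq] = packet[4:4 + length]
    return b''.join(parts[seq] for seq in sorted(parts))

file_contents = b'contents of x.txt, split into small chunks'
packets = make_packets(file_contents, 10)

# Hand the packets over in reverse order: the sequence numbers in the
# headers are what let the receiver rebuild the original data.
restored = reassemble(reversed(packets))
print(len(packets), restored == file_contents)
```

The header is what lets the receiving side make sense of the chunks even when they arrive out of order.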

I can't go too deep into these protocols here, but think of UDP as texting someone on your cellphone when you don't have delivery reports on. The message may or may not have reached the receiver; we would not know.

TCP is like texting with delivery reports on. You know which packets were received by the receiver and which were not, so you can keep transmitting till you get positive delivery reports for all the messages you sent, assuring you that the data has been received, or else you at least know that you couldn't send all these messages. Think of each text message as one packet to be sent across the network, and of the delivery report as a special packet called the ack packet (acknowledgement packet, which the receiver sends back to the sender telling it that some packet has been received). By now you will have guessed the problem with TCP: yes, it is more traffic across the network. Just to make this acknowledgement mechanism possible, we have to send a lot of extra data across the network. However, think of it: very few things in the real world can be sent through UDP, because we don't have a guarantee that anything reaches fully, not even a simple text file.
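The texting-without-delivery-report case can be tried out with a UDP socket (SOCK_DGRAM instead of SOCK_STREAM). This is a minimal Python 3 sketch on the loopback address, with port 0 meaning "any free port" and a 2 second timeout as an arbitrary choice for the example. On a real network the datagram might simply never arrive.

```python
import socket

# Receiver: a UDP socket bound to a free port on the loopback interface.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(('127.0.0.1', 0))
receiver.settimeout(2.0)            # don't wait forever for a lost datagram
address = receiver.getsockname()

# Sender: no connect(), no accept(), no acknowledgement coming back.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b'text without a delivery report', address)
sender.close()

try:
    datagram, source = receiver.recvfrom(4096)
except socket.timeout:
    datagram = None                 # lost, and UDP will not tell the sender
receiver.close()
print(datagram)
```

Notice that the sender finishes happily whether or not anybody received the message.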

The packet is the simplest thing which your computer software understands; however, data is actually sent over hardware ports, devices and lines. So we can be sure that packets aren't the ultimate units which travel over the lines. Actually, the hardware interfaces of your computer place headers and trailers on your packet to make frames, and then these frames are converted to electrical impulses or light beams or whatever, to be sent over cables or fibres.

But since the packet is the simplest unit visible to software, we can't program below packet level (at least not using our system calls).

So what we'll see in this and the next post is how to send data across a network. You'll need the concept of IP addresses, if you still haven't read about it.

Sockets: The first thing I'd like to tell you about is sockets. A socket is basically like a window through which you can pass a packet to the receiver. It's the software abstraction of whatever hardware is used after a packet is passed to the layers below the network layer of the sender and before the network layer of the receiver receives it.

So let's create a simple software connection between two computers which are already connected by hardware. The best part is that you don't need to worry about the hardware used to make the network. This is the essence of the 7 layer model: computer hackers don't need to think about what engineers have done. (I consider myself a bit of both of the above types of people. 😛)

So how do we connect two computers? There are two possible ways:

1) The client-server way. Here one of the machines is the server, that is, it constantly receives traffic (so it needs to wait for any traffic which comes to it; this process is called listening) and performs actions accordingly. The clients have fewer jobs to do. When they want network functionality they connect to a server.

2) The peer-to-peer way. No one is a server or a client. We'll see it later.

ATTRIBUTES OF A SOCKET: A socket in general has 3 attributes. They tell which network layer protocol, transport layer protocol and application protocol are to be used.
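A quick sketch (Python 3) of where the first two attributes show up in code: they are the two arguments given when the socket is created, and they can be read back from the socket object. The third one, the application protocol, comes in through the port number used later in bind() or connect(). The variable names are only for this example.

```python
import socket

# First argument: network layer (AF_INET = IPv4).
# Second argument: transport layer (SOCK_STREAM = TCP, SOCK_DGRAM = UDP).
tcp_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
udp_socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

tcp_attributes = (tcp_socket.family, tcp_socket.type)
udp_attributes = (udp_socket.family, udp_socket.type)
print(tcp_attributes)
print(udp_attributes)

tcp_socket.close()
udp_socket.close()
```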

Let's make two sockets (one for a server and the other for a client).

(Suppose the server and client are on different machines, and hence have different IP addresses.)

Another thing: the port number actually names the application protocol, or in other words tells which application is handling the data. On the server side these are quite fixed and standardised, like HTTP on 80, and thus we can't develop on these ports. The small code we'll write will actually be a new application level protocol using some port.

Server code (on the computer which needs to act as server):

I am writing this on IDLE directly:

IDLE 2.6.6      ==== No Subprocess ====
>>> import sys, socket
>>> port, host = 3333, ''  # our new application level protocol will work on port 3333, for all hosts
>>> newSocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
>>> # created an IPv4 (socket.AF_INET, network layer) and TCP (socket.SOCK_STREAM) socket
>>> newSocket.bind((host, port))
>>> # gave the port number on which the server works;
>>> # passing a blank host means connections are accepted on all of this machine's addresses
>>> print('Server will now run on port %d' % port)
Server will now run on port 3333
>>> maxClient = 4
>>> newSocket.listen(maxClient)
>>>

To check whether your server is waiting for connections on port 3333, use the netstat command (it exists on both Windows and Linux; the grep filter below is for Linux):

$ netstat -an | grep 3333

If you do this you are shown a line containing something like 0.0.0.0:3333 and LISTEN.

3333 is actually a port which some Windows viruses use to host a temporary FTP server. The standard port number for FTP is different (21), and 3333 is a free port number, hence we can use it.

Similarly we write the client code:

IDLE 2.6.6      ==== No Subprocess ====
>>> import sys, socket
>>> port, hostName = 3333, '172.17.1.91'  # say the server is on IP 172.17.1.91
>>> newSocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
>>> # TCP IPv4 socket
>>> newSocket.connect((hostName, port))

After this your client and server are connected. We can now exchange packets between them.

Reading The Python Basics

All my posts on Python basics appear in the reverse of the order I wrote them in, so if you want the most basic ones, go to the first or the oldest ones.

Most of the writing is inspired by Pilgrim's Dive Into Python. Go to http://diveintopython.org to see more cool code.

OOP in Python

We have been using objects all along, but I just didn't mention them. You may often have wondered why some functions are called using the '.' operator, while others just take their input as parameters. Why was someList.append(item) not written as append(someList, item)? Well, this is because the append method is a behaviour of the list, and it's described along with the other properties of list in a class, so it just knows that the list on which it is called is one of its parameters. Just like bite didn't take the dog on which it was called as an input (although it did not use it in any case). So I think you will have already concluded, before I tell you, that the '.' operator can be used to access the contents of an object, be they its nature (data like integers and strings) or its behaviour (the functions placed in the object, typically called methods when they are members of an object).
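A tiny sketch of the hidden parameter: the same append method can be called through the list class itself, and then the list has to be passed explicitly. The variable names are just for this example.

```python
some_list = ['a', 'b']

some_list.append('c')          # usual form: the list is the hidden first parameter
list.append(some_list, 'd')    # same method via the class: the list is passed openly

upper_usual = 'dog'.upper()
upper_explicit = str.upper('dog')

print(some_list)
print(upper_usual, upper_explicit)
```

Both spellings do the same job; the '.' form just fills in the first parameter for you.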

So if lists (and likewise dictionaries, tuples and strings) are objects, what all is an object in Python? Here you actually get the fundamental idea of Python: everything coming from the memory is an object for Python (this might sound familiar to shell scripters, for whom everything is a file). So lists are objects, and so are dictionaries, integers and even functions, modules and files, and then you can define your own objects too. Integers didn't seem to be objects, you say? Yes, that's true, because the developers of Python didn't want it to be just an object oriented language (although underneath it basically is purely object oriented); they made it look like a procedural language for people who just want to use it at that level, and like an OOP language for those who want to use it for OOP.
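You can ask Python directly whether these things are objects. A short sketch, with a throwaway function greet defined only for the illustration:

```python
import math

def greet():
    return 'hello'

# An integer, a float, a string, a list, a dict, a function, a module and a type.
things = [42, 3.14, 'text', [1, 2], {'k': 'v'}, greet, math, int]
all_objects = all(isinstance(thing, object) for thing in things)

int_sum = (3).__add__(4)          # integers have methods too: this is 3 + 4
function_name = greet.__name__    # functions carry attributes like any other object

print(all_objects, int_sum, function_name)
```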

So files, modules, functions, lists (and other things coming from the memory) and the user defined objects are all objects; then why is there so much difference in using them?

Python's objects can be classified based on how much they allow the coder to oversee their functioning. Some of them, like integers, don't allow you to see anything and give you no control over them, while user defined objects can be overseen totally for anything they do or that is done to them. The types are: the inbuilt data types (ones that are a built-in part of the Python language and whose characteristics are totally abstracted from the users, like int, lists, files etc.), the things that we define without knowing that they'll be objects (like modules, classes and functions), and the user defined objects (which we intend to use as objects totally on purpose).

Inbuilt Data Types: These include int, float, lists, dicts, files etc. They are meant to let you abstract away the notion of object and work on them as if they were simply supported formats for storing data. (That's what we have been doing until now.) You have a set of operations you can perform on them and that's all, you are done. The implementation of these operations on the objects is Python's headache.

However, you should have realised that we somehow do tell the Python interpreter what type of object to form. For an integer it's simple, we just write a number; for dictionaries, lists and tuples we use the {}, [] and () symbols respectively; for a string we use ' ' or " "; and we tell it that a file type object is to be created using the open() function.

Objects which you don't define thinking that they are objects: These are functions, modules and classes. You think that they would just be blocks of code, but ultimately they turn out to be objects. We haven't covered classes yet, but I'll talk about functions and modules now.

So first let's talk about functions. Functions are implemented a bit differently from the other objects. They are written in modules, to which they bear a very close relation. We already know that a module is run in two steps: compilation and runtime.

When a module is compiled, its source is checked for syntax errors and converted into runnable code, in a form that the Python runtime environment (the interpreter) can run, called byte code and held in a code object. Then comes the runtime, when the code is actually run line by line. This covers the normal top-level code in the module, including whatever is written in the if __name__=='__main__': block.

However, this should not be the case with functions, as functions should not run until they are called in the functionName(params) format. Sometimes we might have written a module with various functions and call those functions in another module, as we did in the post https://rightbrainedpython.wordpress.com/2011/01/21/many-filed-codes-and-the-use-of-tuples/ So at compile time you should just make the function ready to be called, that's all, and not run its code. If you still did not get it, I'll tell you the exact procedure and you can easily visualise it.

When the module is compiled, the body of the function is converted to byte code too and kept in a code object of its own, but nothing in it is executed. The function object itself is created only when the module is being run and the def statement is reached. So, after the def func(): block has been run once, the function object is ready under the name func, and only then can we call it. You can try going through the post https://rightbrainedpython.wordpress.com/2010/12/28/loops-and-functions/ once more if you still did not get it.
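The two steps can be watched separately with the built-in compile() and exec(). In this sketch (Python 3), the little source string and the dictionary called namespace are made up for the illustration; a real module goes through the same steps when it is imported or run.

```python
source = '''
def func():
    return 'called'
'''

# Compile step: syntax is checked and byte code is produced, but nothing runs.
code_object = compile(source, '<example>', 'exec')
namespace = {}
defined_before = 'func' in namespace      # the def statement has not executed yet

# Run step: executing the def statement creates the function object.
exec(code_object, namespace)
defined_after = 'func' in namespace

result = namespace['func']()              # only now can the function be called
body_code_type = type(namespace['func'].__code__).__name__

print(defined_before, defined_after, result, body_code_type)
```

The function object keeps the compiled body in its __code__ attribute, which is itself a code object.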

Now we'll see modules as objects. A module is essentially any .py file, so almost any code written in Python can be run only after it has been totally converted into an object, called a code object. At runtime, each module gets its own namespace, and the special variable __name__ in that namespace tells how the module came to be run. A module which is run from the command line (or by pressing F5 in the IDE, by lazy people like me) is run directly by the Python interpreter and is given the name __main__, and that is why we write the if __name__=='__main__': block. If you import math in your code in the file myCode.py and then call math.atan(), the module math has been run under its own name, math, and myCode reaches its contents through that name.

So when you import some module, say import importantFunctions, the code in the module importantFunctions.py is compiled and run (but nothing visible happens, as the __name__=='__main__' block won't run: inside the imported module __name__ is importantFunctions, not __main__). The compiled code is stored as a .pyc file, and a reference to the module object is made under the name importantFunctions. Now you can call the functions in the file (which are function objects by now, as the module has already been run). These functions are called as moduleName.funcName(params); when we do that, the functions are looked up in the namespace of the module (the function objects and variables of a module are stored in that module's namespace).
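A short check of the naming business, using the standard math module. The variable names are just for this sketch.

```python
import math
import types

imported_name = math.__name__                    # the module's own name, not the importer's
is_module = isinstance(math, types.ModuleType)   # a module really is an object
atan_via_module = math.atan(1.0)                 # looked up in the namespace of math

print(imported_name, is_module, atan_via_module)

# __name__ of the file being run directly is '__main__';
# in an imported file it would be that file's module name.
print(__name__)
```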

OK, so I hope I could tell you something good about these things. We will still see 'class' type objects later. For now you can think that a new object is created whenever you use a def, an import or a class keyword, even though you don't realise it.

USER DEFINED OBJECTS: This is the type of object we made for the dog in the last post using Java. We'll see how to declare its counterparts here in Python. The most important thing to understand here is that we create a template which tells Python how to create the objects of a new data type, and this template is called a 'class'. Interestingly, even this template called class is handled as an object; we've seen that Python handles a class as an object even though we don't come to know about it.

But for now, we'll just see how to define a template for objects of a new data type.

class objectType(object):

    def __init__(self):
        self.varName1 = <someValue>
        self.varName2 = <someValue>
        # ... and so on; no return statement

    def method1(self, <params to method 1>):
        # do stuff

    # ... and other methods too

The above is the general format for defining the template of an object. All the data members (the ones which tell the nature or the state of the object) are initialized in the __init__ method with the values they should have at the time of creation. The functions which represent the behaviour of the object are defined with def and take self as their first argument, which stands for the implicit parameter (just as the list is the hidden parameter in someList.append('damn its cool')). It has some real significance too, which I'll tell you about when I discuss classes as objects. Till then, just think of it in the above way.

So let's write some code which works on the dog object.

class Dog(object):

    def __init__(self):
        self.name = ''
        self.ownerName = ''
        self.age = 0

    def bite(self, oneWhoWasBitByDog):
        oneWhoWasBitByDog.bites += 1

if __name__ == '__main__':
    myDog = Dog()  # __init__ is called with the params you give here (none), creating an object of type Dog
    myDog.name = 'Rover'
    myDog.ownerName = 'Muktabh'
    myDog.age = 2
    idiot = Person()  # some Person type object
    myDog.bite(idiot)

The above code is just introductory, so don't run it yet; you can run it only after defining the Person class accordingly. Anyway, you got the fact that the methods of an object need to be written with self as the first param, but are never called by passing the object in their param list. That parameter is hidden.
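If you want something you can run right away, here is one possible completion. The Person class below is my own minimal assumption (the post never defines it): all it needs is a bites counter for Dog.bite to increase.

```python
class Person(object):
    # Hypothetical Person class, assumed only for this sketch.
    def __init__(self):
        self.bites = 0

class Dog(object):
    def __init__(self):
        self.name = ''
        self.ownerName = ''
        self.age = 0

    def bite(self, oneWhoWasBitByDog):
        oneWhoWasBitByDog.bites += 1

myDog = Dog()
myDog.name = 'Rover'
idiot = Person()

myDog.bite(idiot)          # self is filled in automatically with myDog
Dog.bite(myDog, idiot)     # the same call with the hidden parameter written out

print(idiot.bites)
```

Both calls end up in the same method, so the poor person gets bitten twice.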

A C++/Java guy will ask, 'Is __init__ a constructor?' Technically it's not. Although it performs part of the job of a constructor, namely defining the data members of the object, its working is not exactly the same as that of a Java constructor. We'll see how a constructor in Python is written in some other post. Meanwhile I'll put up code for a working object type which you can study, run and use as a model for similar code if you want. It defines a class for complex numbers:

import math
class ComplexNumber(object):
    def __init__(self, x=0.0, y=0.0):
        # real part x and imaginary part y; both default to 0.0
        self.x = float(x)
        self.y = float(y)
    def setValues(self, a, b):
        self.x = float(a)
        self.y = float(b)
    def getModulus(self):
        return math.sqrt(self.x ** 2 + self.y ** 2)
    def getConjugate(self):
        return ComplexNumber(self.x, -self.y)
    def getPolarFormTuple(self):
        r = self.getModulus()
        if self.x == 0:
            if self.y == 0: arg = None  # the argument of 0 is undefined
            elif self.y < 0: arg = -1 * math.pi / 2
            elif self.y > 0: arg = math.pi / 2
        elif self.x < 0:
            if self.y >= 0:
                arg = math.atan(self.y / self.x) + math.pi
            else:
                arg = math.atan(self.y / self.x) - math.pi
        else:
            arg = math.atan(self.y / self.x)
        return (r, arg)
if __name__ == '__main__':
    a = ComplexNumber()
    a.setValues(-4, -1)
    print('The number in polar form is r=%f theta=%f radians' % a.getPolarFormTuple())
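As a cross-check on the quadrant handling above, Python's standard library already has a complex type and a cmath module. This sketch compares cmath.polar with the same by-hand formula the class uses for the third quadrant (x < 0, y < 0); it is only a sanity check, not a replacement for writing your own class as an exercise.

```python
import cmath
import math

z = complex(-4, -1)
r, theta = cmath.polar(z)           # modulus and argument from the standard library

# The same values by the formulas used in the class for x < 0 and y < 0.
modulus_by_hand = math.sqrt((-4.0) ** 2 + (-1.0) ** 2)
arg_by_hand = math.atan(-1.0 / -4.0) - math.pi

print(r, theta)
print(modulus_by_hand, arg_by_hand)
```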
    

OBJECTS… whats dat ??

Object oriented programming was an 'in' feature when Python was developed in the early 90s; Java came into existence in the same phase as Python. So Python joined the line of object oriented programming languages. First we'll see what objects and object oriented programming (from here on referred to as OOP) are, and then come and see how it's done in Python.

We need to look through the history of programming languages to see why OOP evolved. The first computers were essentially calculators, not even as good as the ones we use to solve optimisation problems in engineering. They could do addition, subtraction, multiplication and division on integers, that's all. Still, they consumed as much electricity as a small town and needed a whole building to house them.

And not just that, most of the languages evolved as tools to solve mathematical problems and similar things. FORTRAN, one of the early languages, is actually an abbreviation of Formula Translation. Now, the above four operations could serve as a basic set of operations along with newly added features like bit shifts and comparisons (<, >, <=, >=, ==), but you don't have an advantage over the old type of computers (the ones that were kept in an entire building) if you need to write every mathematical problem in terms of simple summations and other such ops. So what we needed to do was to increase our set of possible operations beyond the basic symbols + - etc., and soon we had functions. Almost all languages of that time had functions in them. C, which was developed at that time with exactly the same purpose as Python has today, had to take up this trend.

Now the user could define his own operations like lcm(), maximum(), gaussian() etc. and make programming simpler by using these new operations, rather than writing the entire thing in the form of +s, -s and *s. This solved the problem. FORTRAN was a hit, and C, although it was made for system administration, turned out to become the most widely used language from that time to now. These languages are still considered good for writing mathematical problems, and that's why we still have FORTRAN around, which is like the grand old mother of computer languages.

But then the aims of programming languages started changing. As computers grew cheaper, we needed to make them useful for normal people so that they would buy them. Since not everyone buys a computer to solve a queuing theory problem to assess how many web servers are required to handle a website, or to find the launch velocity of a module which is to reach Mars in exactly 1 year, we needed to develop applications. The languages above were terrible at developing them. GUIs and modules become very tough to code in a language of the same kind of stature as C. It was officially declared that the languages of that time were not suitable for writing big applications (such a condition is called the software crisis). It's not impossible to write the stuff, but the procedure is tough and buggy. If you boot your Linux system, you'll see 3 kinds of windowing systems: GNOME, KDE and X Windows (I'm talking about what general people use; there are more, like Fluxbox etc., but still). X Windows was essentially written for writing apps in C, as it doesn't require OOP, but the other two, which you mostly see being used to write software, are written to be used with OOP.

So what was OOP's inception point? Older computers needed an extension of the set of operations, so that more and more internal mathematical complexity could be abstracted away when writing code. In the same way, we needed an extension of data types, i.e. of the kinds of data that can be expressed in newer languages. We needed to represent an application window as a data type so that we can add things to it easily, rather than having to call a function on three things just to add them to one of the function's params, which was the application window. Similarly, what if you wanted to represent a dog or a truck in your software? How could you write that in C? Some people will say that you could write a structure in C, like

struct dog
{
    char* name;
    int age;
    char* ownerName;
};

struct dog myDog;
char dogName[20] = "doggy";
myDog.name = dogName;

But is that all a dog's got? A dog needs to do something in an app, say bite a person; how do you write that? Writing an external function may seem to be an optimum solution here, but think of huge apps which have hundreds of such representations: dogs, trucks, men, colleges, secretaries all together. You need objects, where everything related to a dog is written together and everything related to a truck is written together.

Let's see how we'd put a dog together in Java. (All I want to show is what all you can put in a single module representing a dog; if you can just see what is in between the {}s representing the dog, and don't get a single line of code, that's enough.) In Java we use classes to do the same job as the structure, but we can encapsulate functions in the representation of the dog too.

class Dog
{
    public String name; // public means it's visible even from outside the class, i.e. you can do a.name="doggy"
    public String ownerName;
    public int age;

    public void bite(Bitable b)
    {
        b.numberOfBites++;
    }
}

As you can see here, we have combined the data about the dog and the behaviour of the dog (biting).

Now it's simple.

Dog myDog = new Dog();
myDog.name = "doggy";
myDog.ownerName = "Muktabh";
myDog.age = 2;
Person someoneWhoWasBittenByMyDog = new Person();
myDog.bite(someoneWhoWasBittenByMyDog);

OK, so now you should have at least some idea of what objects are. They are made to model, as a data type, something which humans interact with in the human world. So apart from int and float we have user defined data types also. Objects have to obey 3 principles: inheritance, encapsulation and polymorphism. (Just remember that we talked of a few properties of objects; don't google them up now, unless you are totally in the groove.)

In the next post I'll discuss how these things are taken up in Python.

Files in Python (Working on a simple text file)

A file, the thing in which data is kept on a computer, be it video data in an avi file, a webpage in an html file or a normal text file, is a set of related data that would be needed by a running program.

Essentially all files are written in the form of bits and bytes, and hence we can work on almost any kind of file, be it avi or txt, using just a single interface to read and write files. But files for special systems (say avi for VLC) are given formats so that the software running them can read them better. For example, the HTML page which appears as plain text in the browser is actually a set of tags and text, and maybe a linked CSS file. So what we can deal with using the simplest possible interfaces are text files and text-based files, like those containing C code, Python or shell scripts.

To learn about files, we need to learn how they are stored in a Linux system. Windows might not have a similar system, but we can just work on Windows as if it had a Linux-like file system. You guessed it right, Python takes care of the stuff for us.

So how are files made accessible to the end user? They are provided through a 3 layered hierarchy:

Process Level View (how the files are related to the process, i.e. our software, say our Python script, gedit etc.): implemented through a data structure called the file descriptor table.

Intermediate View (implements the process level view on each of the files): implemented through a data structure called the file table.

Actual implementation of files on disk (implements the functionality provided by the above views on disks, with the help of device drivers).

Now we as users don't need to care about the lowermost level (i.e. the one with device drivers etc.).

A file descriptor table is a data structure kept for each process (a running program). It holds one entry, a handle, for every file the process currently has open.

A file table entry holds data such as where the cursor is in the file while it is being read, and which operations can be called on the file. These operations are ultimately implemented using device drivers.

When we open a file in Python, we get a handle to the file in our file descriptor table, so that we can read the file, get and set the position of the cursor, and write to it. All of this functionality, however, lives in the file table entry that the handle points to. Think of it this way: if two programs running in parallel read and write the same file, each can see the other's modifications once they have been written out. If they also share the same file table entry (for example a child process that inherited the open file from its parent), the position of the cursor (where reading and writing happen, also called the file offset) moves for both of them together.

How to open a file ??

Make sure your code is stored in the same directory as the file you're working on; only then can you refer to files by name, say 'Itachi.txt'. Otherwise (if your code and files aren't in the same directory) you will have to work with full paths like 'f:\myCodes\Itachi.txt'. If you perform file operations in the interactive shell (>>>) and don't want to write full path names while referring to files, you should store them in the lib folder of your Python installation.

OK, so the basic commands to work on a file are:

file = open('<fileName>','<opening mode>') to open a file

The opening mode can be 'a' (append), 'r' (read), 'w' (write), or 'rb' and 'wb' for reading and writing in binary format. There are others too, e.g. 'a+' and 'w+' (just google for them), but for this tutorial these will suffice.

The file can then be closed, i.e. the entry in the fdt (henceforth my name for the file descriptor table) of the process will lose its reference to the file. This is done by simply writing:

file.close()

Reading from a file can only be done if it's opened in 'r' mode or a related one (and not plain 'a' or 'w').

Lines=file.readlines()

This reads the file into a list, with one line of the file per list element.

However, the more powerful command is:

content=file.read()

This reads the entire contents of the file into a single string.

To write to a file, open it in write mode. Do this only for a new file or one whose contents you no longer need: opening a file in write mode erases its old contents, and creates a new file if it doesn't exist on disk. If you want to append data to a file, open it in 'a' mode instead; write() then adds the string at the end of the file.

file.write(<String>)
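Here is a small sketch of the difference between 'w' and 'a'. It is only an illustration: the file name modes_demo.txt and the helper read_all are made up for this example, the file lives in a temporary directory so nothing real gets overwritten, and print is written with parentheses so the snippet also runs on Python 3 (the other snippets in this post are Python 2).

```python
import os
import tempfile

# Hypothetical scratch file inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'modes_demo.txt')


def read_all(file_path):
    """Helper for this example: return the whole file as one string."""
    handle = open(file_path, 'r')
    content = handle.read()
    handle.close()
    return content


f = open(path, 'w')        # 'w' creates the file (or empties an existing one)
f.write('first line\n')
f.close()

f = open(path, 'a')        # 'a' keeps the old contents and writes at the end
f.write('second line\n')
f.close()
after_append = read_all(path)

f = open(path, 'w')        # opening with 'w' again throws the old contents away
f.write('only line\n')
f.close()
final_content = read_all(path)

print(after_append)
print(final_content)
```

After the append both lines are in the file; after the second 'w' only the last write survives.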

One more thing: as in other languages, the file offset is updated automatically after each read and write; we don't need to set it manually. So when you read the file once with readlines() or read() and then call the function again, it returns nothing (an empty list or an empty string), because the cursor is at the end of the file and there is nothing left to read.

A better understanding comes from using the for..in construct on files. When you write 'for line in file:', the name line takes each line of the file in turn, and the block inside the body of the for is executed every time. You can picture the lines being read one by one, in order, so there is some cursor moving through the file as it's read, and nothing more is read once it reaches the end of the file.
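To watch the cursor behave this way, here is a minimal sketch. The file name offset_demo.txt is invented for the example and the file is created in a temporary directory; it is written so that it also runs on Python 3.

```python
import os
import tempfile

# Hypothetical three-line file, created only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), 'offset_demo.txt')
f = open(path, 'w')
f.write('one\ntwo\nthree\n')
f.close()

f = open(path, 'r')
first_read = f.read()     # reads everything; the offset is now at the end
second_read = f.read()    # nothing left to read, so this is an empty string

f.seek(0)                 # move the offset back to the first character
lines = []
for line in f:            # for..in hands us one line at a time, in order
    lines.append(line.rstrip('\n'))
f.close()

print(repr(first_read))
print(repr(second_read))
print(lines)
```

The second read() comes back empty, and only after seek(0) does the for loop see the three lines again.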

Now see the following code, in which I use these basics for some file operations, for example counting the occurrences of each word in a file. The file is a text file called myBlog.txt, and it's in the same directory as my Python code. Besides this, I copy one file into another with the lighter lines (those with fewer characters) first and the heavier ones later. You'll see a lot of file and dictionary use in it.

if __name__=='__main__':
    print 'lets see how to work on files'
    f= open('me.txt','w')
    #open() opens a file for us
    # This file is opened in writable mode, if its not on the disk,
    #a new one is created and given to us for use
    f2= open('myBlog.txt','r')
    #This is a file containing instances of text from my blog ebliz.wordpress.com
    #now we first print the file line by line
    for line in f2:
        # for .. in iterates over lines in case of files
        print line
    li8=raw_input('Press Enter to continue')
    #If we want to do line by line analysis for some job
    #we can use the readlines() functionality
    #say arranging Lines in order of the number of characters
    #However since f2 is a file pointer, it would have reached the end of the
    # file and hence needs to be set back to starting
    f2.flush()
    f2.seek(0,0)
    for sortedLine in sorted(f2.readlines(),key=lambda x:x.__len__()):
        f.write(sortedLine)
    li8=raw_input('Press Enter to continue')
    #Ultimate Thing is when we want to do wordwise analysis
    #say count the occurence of each word
    # We can use the read() function
    #While readlines reads the file into a list of Strings containing each line
    # as an element of List, read() reads the entire file into a single string!!
    # So Now we are counting the occurence of each word
    wordCount={}
    f2.flush()
    f2.seek(0,0)
    for word in (f2.read()).split():
        wordCount[word]=wordCount.setdefault(word,0)+1

    #print the words grouped by their first character
    for word in sorted(wordCount.keys(),key= lambda x:x[0]):
        print '%s occurs %d times'%(word,wordCount[word])
    f2.close()
    f.close()
   

Most of the above code is easy (try reading the comments). However, I'll discuss the new things that I haven't covered so far.

First of all, the flush() and seek() functions. If you look at the code carefully, you'll see that they are always used when we are done reading all the lines. This is to set the file offset back to the first character: seek(0,0) brings the offset back to the start of the file. flush() empties the file's buffer; it really matters for files being written (it pushes pending writes out), and for a file opened only for reading it is harmless.

The other new thing is the way I count the occurrences of words in the file. The straightforward approach is to keep a list of words and, for each word read, scan the list linearly for its count so far and add one. What I do instead is hash each new word into a dictionary, and on a second or later occurrence increment the value stored with it. This is a shorter and faster algorithm.

The setdefault() call has been used on a dictionary; let's analyse its purpose. Suppose we fetch a word from the file. We have to hash it in our dictionary and see whether it has been encountered before. If it has been encountered once or more, the hash table tells us how many times, and all we need to do is increment this value. If the word is being encountered for the first time, there is no value for it in the dictionary yet, and we have to set the value to 1.

dict.setdefault(key,defaultValue) does this dual job in a single statement. It returns dict[key] if a value for the key has been set earlier. If it hasn't, it performs dict[key]=defaultValue and returns defaultValue. So the call behaves like dict[key], except that when dict[key] has no value yet, it first gives it the defaultValue expected on the first occurrence.
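A tiny sketch may make this clearer. The sentence being counted is made up, and the second loop is the longhand if/else version written out only for comparison (the snippet also runs on Python 3):

```python
text = 'the cat saw the other cat'   # made-up sample text

# Counting with setdefault: one statement per word.
word_count = {}
for word in text.split():
    # Returns the stored count, or stores 0 and returns it on first sight.
    word_count[word] = word_count.setdefault(word, 0) + 1

# The same job written out the long way.
long_count = {}
for word in text.split():
    if word in long_count:
        long_count[word] += 1
    else:
        long_count[word] = 1

print(word_count)
print(word_count == long_count)
```

Both dictionaries end up with the same counts; setdefault just folds the "is it there yet?" check into the lookup.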

I think the rest is clear and self-explanatory. BTW, if you didn't get what the title means, try Google-translating it from French to whatever language you want.

My first hack – how to use a dictionary

Last sem I had my Data Structures and Algorithms course. I liked it a lot, but poor me... the Vi editor isn't something I was made to work in. (Seriously, if you are a visual-spatial learner and want to learn things, start by reading 'Eclipse for Dummies' before anything else.) So I messed up all my three-hour practical components (timed coding tests have been dooming me for the last 3 sems) and made a C. (I hate to say it, but what most T Schools test is not your coding ability but your ability to code with the least resources.) However, I'm bringing up my algorithms course not to advise visual-spatial learners but to tell you about hash tables. Let's come to the point.

Hash tables are awesome data structures. They do one job remarkably well: relating two values of any kind and storing these related pairs in a single big data structure so that they can be found again quickly. Let's look at an example: how would you implement a password check for, say, a mail server (something like Gmail)?

The simplest solution is: put all the user names and passwords in a huge list, then linearly compare each element of this list with the data the user gave until you find a match. This approach will work, but what if the list holds millions of users? Every single check of a username and password may have to walk through the whole list, which is far too slow for a busy server. (And that is assuming you can fit the list into RAM.)

So you need database systems. A database can fetch one record from a set of millions in almost constant time (i.e. the search time hardly depends on the number of things stored). One way to achieve this is with hash tables.

A hash is a quickly calculable mathematical function that takes something as input and returns a number based on it. All the versions I know of work using prime numbers. To find where a username and password are stored, you apply some hash function h(x) to your key and get the memory location out of it, so you know the address of the required location very fast. What if two things hash to the same number? It does increase the fetch time, but there are ways to build hash tables so that only a minimum number of comparisons (the normal list approach) is needed beyond the hash. One is to use another hash function to compute the next possible location to check, and so on. If you didn't get it, leave it for later; implementing a hash isn't our concern right now. To console you: a lot of effort goes into making sure the right entry is usually found just by applying the hash function, and a set of hash functions can be used to improve things further.
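If you are curious anyway, here is a toy sketch of the idea. It is not how Python's dictionaries are implemented: toy_hash, put, get and the table size 11 are all invented for this example, and clashes are handled by keeping a short list in each slot (chaining) rather than by the second-hash-function trick described above. It also runs on Python 3.

```python
TABLE_SIZE = 11  # a small prime, chosen only for illustration


def toy_hash(key):
    """Turn a string into a slot number between 0 and TABLE_SIZE - 1."""
    total = 0
    for ch in key:
        total = total * 31 + ord(ch)
    return total % TABLE_SIZE


# Each slot holds a short list of (key, value) pairs that hashed to it.
table = [[] for _ in range(TABLE_SIZE)]


def put(key, value):
    bucket = table[toy_hash(key)]
    for i, (stored_key, _) in enumerate(bucket):
        if stored_key == key:
            bucket[i] = (key, value)   # key already stored: overwrite it
            return
    bucket.append((key, value))


def get(key):
    # Only the few entries in one slot are compared, never the whole table.
    for stored_key, value in table[toy_hash(key)]:
        if stored_key == key:
            return value
    return None


# Made-up usernames and passwords.
put('muktabh', 'secret1')
put('itachi', 'secret2')
print(get('itachi'))
print(get('nobody'))
```

The lookup jumps straight to one slot and then falls back to the plain list approach only inside that slot, which is why the number of comparisons stays small.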

So now that we have an idea of what a hash is, let's work out what it is actually used for. Python uses hashes a lot for its own functioning, and Python code uses them a lot too. Hashes are made available in Python through a data structure called the dictionary.

Let's see how we work with a dictionary:

 
IDLE 2.6.4
>>> a={'name':'Muktabh'}
>>> a['name']
'Muktabh'
>>> a={'name':'Muktabh','College':'BITS Pilani'}
>>> a['name']
'Muktabh'
>>> a['College']
'BITS Pilani'
>>> a
{'College': 'BITS Pilani', 'name': 'Muktabh'}
>>> #traversing through a dict
>>> for i in a:
    print '%s is %s' %(i,a[i])


College is BITS Pilani
name is Muktabh
>>>

These were the very basic ops. The first ones show how to initialize a dictionary. The terms before the colons in a={'name':'Muktabh','College':'BITS Pilani'} are the keys of the dictionary, i.e. 'name' and 'College' in this case. The keys are hashed (i.e. the hash function is applied to them) and the values are stored according to the hash values. After this we have iteration over the dictionary. Clearly, for..in iterates over the keys of the dictionary and not its values, so we need to access the values with dictName[key].

IDLE 2.6.4
>>> a={'name':'Muktabh','college':'BITS Pilani'}
>>> a['favProgLang']='Python'
>>> a
{'favProgLang': 'Python', 'college': 'BITS Pilani', 'name': 'Muktabh'}
>>> #added a new key value pair to the dictionary
>>> a.keys()
['favProgLang', 'college', 'name']
>>> a.values()
['Python', 'BITS Pilani', 'Muktabh']
>>> a.items()
[('favProgLang', 'Python'), ('college', 'BITS Pilani'), ('name', 'Muktabh')]
>>> for k,v in a.items():
    print k,v


favProgLang Python
college BITS Pilani
name Muktabh
>>>
>>> b={'fav actoress':'katrina','fav movie':'Rock n Rolla'}
>>> a.update(b)
>>> a
{'favProgLang': 'Python', 'college': 'BITS Pilani', 'name': 'Muktabh', 'fav movie': 'Rock n Rolla', 'fav actoress': 'katrina'}
>>> # added one dictionary to other

The above code shows that a dictionary is mutable. There are two ways to add elements to a dictionary. The first is the update method, which adds another dict to the dict it is called on with the (.) operator, i.e. dictB is added to dictA by doing dictA.update(dictB). The other is directly doing a[newKey]=newVal.

Other important things are getting lists of keys, values and (key,value) tuples using the methods someDict.keys(), someDict.values() and someDict.items(). We can then iterate over them using for..in. Also, the two ways of adding to the dictionary can be used to change the values corresponding to existing keys.

IDLE 2.6.4
>>> a={'Name':'Muktabh','college':'BITS Pilani'}
>>> a['Name']='Mayank'
>>> a
{'college': 'BITS Pilani', 'Name': 'Mayank'}
>>> b={'college':'IIT Bombay','Hobby':'Guitar'}
>>> a.update(b)
>>> a
{'Hobby': 'Guitar', 'college': 'IIT Bombay', 'Name': 'Mayank'}
>>>

I am ending this post with an easy piece of code that lets you write your first hack. With it you will get all the data that your operating system shares with any process running on it (its environment variables).

import os
if __name__=='__main__':
    for k,v in os.environ.items():
        print '\n'+k+' is '+v

Many-File’d codes and the use of Tuples

Hello reader (if there's a reader at all😛 ).. it's been a long time since I came back to this blog of mine. I am sure that the coming posts are going to be more interesting than what has already passed (the last post was cool, I tried to write code that was cool as well as basic.. hopefully😐 ). Anyway, here we are, and today we will talk about tuples. Tuples are built-in data structures which are very much like lists. But there are some fundamental differences that we'll see now.

Defining and working with a tuple is easy: () surrounds the elements of a tuple just as [] surrounds the elements of a list:

IDLE 2.6.4
>>> myTuple=('a',2.3,['hello m a list in a tuple',2222])
>>> myTuple[0]
'a'
>>> myTuple[3]

Traceback (most recent call last):
  File "<pyshell#2>", line 1, in <module>
    myTuple[3]
IndexError: tuple index out of range
>>> myTuple[2]
['hello m a list in a tuple', 2222]
>>> for i in myTuple:
    print i


a
2.3
['hello m a list in a tuple', 2222]
>>> myTuple[2]='hi'

Traceback (most recent call last):
  File "<pyshell#7>", line 1, in <module>
    myTuple[2]='hi'
TypeError: 'tuple' object does not support item assignment
>>> myTuple[2],myTuple[0]=myTuple[0],myTuple[2]

Traceback (most recent call last):
  File "<pyshell#8>", line 1, in <module>
    myTuple[2],myTuple[0]=myTuple[0],myTuple[2]
TypeError: 'tuple' object does not support item assignment
>>>

For the first few steps, that is defining a tuple, accessing some element of it and iterating through it, things were essentially very much like a list. But then we see:

>>> myTuple[2]='hi'

Traceback (most recent call last):
  File "<pyshell#7>", line 1, in <module>
    myTuple[2]='hi'
TypeError: 'tuple' object does not support item assignment

This is something that was perfectly possible with a list, but it's not possible here. It shows a fundamental property of tuples: they are immutable. Once a tuple is given its value, it can't be changed. I'll share with you a doubt that I had when I first read the term immutable (when I first read about Python some 2 years ago).

IDLE 2.6.4     
>>> a=(1,2)
>>> a=(6,7)
>>>

Didn't we change the value of the tuple here? The answer is no; we just made the name 'a' stop pointing at (1,2) and point to a new tuple (6,7). This concept will be discussed in an upcoming article, but for now think of it this way: in Python (and in most other object-oriented languages), the name we give to a value and the value itself (technically called an object, which we'll discuss very soon) are two different entities. A variable name like a can be pictured as a space in memory that stores the address of a value. So when we did a=(6,7), we just changed the address stored under the name a: it earlier held the address of (1,2) and now holds that of (6,7). The value (or object) itself is untouched. However, when we did myTuple[2]='hi', we wanted to change the third element of the tuple ('a',2.3,['hello m a list in a tuple',2222]) itself, and that wasn't allowed. This is immutability. If we do the same thing on a list there is no problem, because lists are mutable.

IDLE 2.6.4     
>>> myList=[1,2,3]
>>> myList[1]=4
>>> myList
[1, 4, 3]
>>>
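The difference between pointing a name at a new object and changing an object in place can be seen in a short sketch. The names old and alias are introduced here only for illustration, and print is written with parentheses so the snippet also runs on Python 3.

```python
a = (1, 2)
old = a               # a second name for the very same tuple object
a = (6, 7)            # rebinding: the name a now refers to a different object
print(old)            # the original tuple is untouched
print(a is old)       # the two names no longer refer to the same object

my_list = [1, 2, 3]
alias = my_list       # a second name for the very same list object
my_list[1] = 4        # mutation: the one list object changes in place
print(alias)          # the change is visible through both names

try:
    old[0] = 99       # a tuple refuses this kind of in-place change
except TypeError as err:
    print('not allowed: %s' % err)
```

Rebinding a leaves old alone, while changing my_list[1] shows up under alias too, because there was only ever one list.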

Since we know that tuples are immutable and that their order can't be changed, how can we exploit the fact? Tuples are suited to storing things that make sense only when written and expressed together and in a particular order: say, the coordinates of a point (1,1,1), whose order is extremely important, or the arguments to a function (try to visualise why the params of a function are good to keep in a tuple; we'll use the concept later).
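As a small taste of both uses, here is a sketch in which the functions distance_from_origin and show_args and the sample point are made up for the example (it also runs on Python 3):

```python
def distance_from_origin(x, y, z):
    """Distance of the point (x, y, z) from (0, 0, 0)."""
    return (x * x + y * y + z * z) ** 0.5


point = (3, 4, 12)                 # order matters: (x, y, z)
x, y, z = point                    # a tuple can be unpacked into names
d = distance_from_origin(*point)   # the * spreads the tuple over the parameters


def show_args(*args):
    """Any extra positional arguments arrive packed in a tuple."""
    return args


print(d)
print(show_args(1, 'a', 2.5))
```

Swapping two coordinates would describe a different point, which is exactly why a fixed-order, unchangeable container suits this kind of data.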

Hmm, so now we know how to make a tuple work. Let's switch over to another concept. Suppose you are given a 'special no seeing each other's code allowed, test your coordination' project in which you implement the main part of the program and ask your friend to implement some functionality (i.e. you give him the function names, the params you'll pass to them and their purpose, and ask him to write the functions). In such a case you'll have to write code in two different files and then just run it in front of the judges. In most real-world apps, too, we need more than one file of code for various reasons. (The files can't simply be combined either: there would be thousands or millions of lines of code, and you would need to arrange things in the order they are defined and needed; see the post where we wrote a function and had to put it before the statement calling it.)

To use functions (and other resources) from another file in Python, we need to import the file as a module, using the import statement. The file being imported must be in the same directory as the file importing it, or in a directory whose path is listed in the PYTHONPATH variable on your computer (or elsewhere on Python's module search path, such as the installation's lib directory).

PYTHONPATH is an environment variable which can be changed from the command line on Linux systems or through the Control Panel on Windows. The directories Python actually searches typically include the directory of the running script (like the . of Linux, meaning the present working directory, which is what lets a module be imported from the same directory as the file using the import statement), the lib directory under the Python installation root, and whatever PYTHONPATH adds.
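If you want to see where your own Python looks for modules, a quick sketch like this prints the search list. sys.path and os.environ are standard; the output depends entirely on your machine, so none is shown here (the snippet also runs on Python 3).

```python
import os
import sys

# sys.path is the list of directories Python searches when it meets an
# import statement, in the order they are tried.
for directory in sys.path:
    print(directory)

# Directories named in the PYTHONPATH environment variable (if it is set)
# are added to that list when Python starts.
pythonpath = os.environ.get('PYTHONPATH', '(PYTHONPATH is not set)')
print(pythonpath)
```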

A lot of modules come bundled with Python (in the lib directory of your Python installation root), and we can use them directly. To find out what a module contains you can use the dir and help functions. Let's see.

    ****************************************************************
   
IDLE 2.6.4
>>> import math
>>> dir(math)
['__doc__', '__name__', '__package__', 'acos', 'acosh', 'asin', 'asinh', 'atan', 'atan2', 'atanh', 'ceil', 'copysign', 'cos', 'cosh', 'degrees', 'e', 'exp', 'fabs', 'factorial', 'floor', 'fmod', 'frexp', 'fsum', 'hypot', 'isinf', 'isnan', 'ldexp', 'log', 'log10', 'log1p', 'modf', 'pi', 'pow', 'radians', 'sin', 'sinh', 'sqrt', 'tan', 'tanh', 'trunc']
>>> help(math.log10)
Help on built-in function log10 in module math:

log10(...)
    log10(x) -> the base 10 logarithm of x.

>>> log10(1000000)

Traceback (most recent call last):
  File "<pyshell#3>", line 1, in <module>
    log10(1000000)
NameError: name 'log10' is not defined
>>> math.log10(1000000)
6.0
>>>

So dir gives the names of all the functions in a module (its real purpose is a superset of this; I'll tell you when the time is right).

Even after importing the module, you can't call a function in it just by the function's name; you call it as moduleName.functionName(args), as you can see above.

You can see the documentation written for a function by using the help function.

dir and help are two introspection functions in Python, which help you learn about imported modules. Python has a pretty large collection of built-in functions and you can't remember the format of each, so it's extremely useful to reach for these two while writing Python code.

Now I'll write a module myself and use its functions to show how things are actually done.

Here we try to implement a very basic relational database system (kind of: it takes thousands of people to develop real database systems, so you can't compare my code to them, but anyway).

filename:listBasedRelationalDatabase.py

if __name__=='__main__':
    pass
def getKeys(totalList):
    #collect the first element (the name) of every record
    keys=[]
    for i in totalList:
        keys.append(i[0])
    return keys
def getDataForAKey(totalList,key):
    #return the attribute list of the record named key, or [] if not found
    value=[]
    for i in totalList:
        if i[0]==key :
            value=i[1]
    return value
def printWholeRelation(totalList):
    for iTuple in totalList:
        print 'for %s :'%(iTuple[0],)
        print 'age is : %s id number is : %s' %(iTuple[1][0],iTuple[1][1])

    return

Then I’ll use these functions in a code which implements the database:

import listBasedRelationalDatabase
if __name__=='__main__':
    "Uses the totalList as a relation (as in the one used in relational databases), and the data is stored in the form of a tuple (name,[list of attributes])"
    totalList=[]
    for i in xrange(int(raw_input('How many people need to be kept record of ?'))):
        totalList.append((raw_input('Enter The name'),[raw_input('Enter the age'),raw_input('Enter the id number')]))
    response=int(raw_input('What do You want to see ?,Enter 1 to see the Keys, 2 to fetch the data related to some key and 3 to print out the whole data'))
    if response==1 :
        print listBasedRelationalDatabase.getKeys(totalList)
    elif response==2 :
        print listBasedRelationalDatabase.getDataForAKey(totalList,raw_input('Enter The Key'))
    elif response==3:
        #printWholeRelation does its own printing and returns nothing
        listBasedRelationalDatabase.printWholeRelation(totalList)
    
   
I'll give some explanation of the code above. First of all, both files are .py files (Python scripts), so either can be run with the command python fileName.py. That's why we wrote the if __name__=='__main__': check in both files: the block under it runs only when the file is launched directly, not when it is imported. For the file that is only meant to supply functions for importing, I just put a pass statement under __name__=='__main__'. The nearest analogy for the pass statement (for people with experience of other languages) is the : of bash, or a pair of empty {} for Java and C people; for people new to coding, it means do nothing. So this file will do nothing if you launch it as python listBasedRelationalDatabase.py.

Then there are a few functions which are quite easy to understand. Notice how I build a tuple whose elements are a string and a list containing two strings here:

totalList.append((raw_input('Enter The name'),[raw_input('Enter the age'),raw_input('Enter the id number')]))

I think that's all that needs explaining in the code above.

Pythonian Style de coding : lists and String

Python is a general-purpose language, and that's why we can write any kind of code in it (say the sorting code we just wrote). Its format is very much like English pseudocode, and so many people consider it an ideal first computer language. I don't totally support this view, though. Python as a first language isn't good for someone into core Computer Science; a C/Java combo is much better, since in my view, to appreciate how cool something or someone is, you need to learn it the cool way and not the A..B..C way. For people who are web developers or self-taught hackers, Python is the best possible first language, because it spares them the tough C/Java syntax and lets them get straight to the point.

Python is also a scripting language (think of the cool things you can do on the command line, with just one command) and so is destined to be cool (trust me, there is no cooler language, except perhaps Ruby, which has its own cool ways). Just as in the shell, there are a lot of powerful commands and built-in functions in Python which make your job easier.

A Python developer isn't normally supposed to code insertion sort all his life. When developing big apps, coding small things like insertion sort isn't something you'll want to do; the overall design matters more. So things like sort and other basic algorithms come precoded in Python, so that you can concentrate on more important jobs and not keep reinventing the wheel.

This code shows how a developer will normally use Python. (It takes a string and arranges its words in order of the number of characters they contain.)

def len(s):
    #note: this replaces the built-in len(), which would work just as well here
    return s.__len__()
if __name__=='__main__':
    "This code takes a string as input and prints it out such that the lightest (with least characters) words come first and the heaviest at the end."
    myString=raw_input('Enter the string')
    words=myString.split()
    sortedWords=sorted(words,key=len)
    myString=' '.join(sortedWords)
    print myString,'is the new String'

Written the usual way, this job would mean separating the string into its component words and then arranging them in order by hand. Typical C code for it would take about 60 lines. The code above is much smaller, and only needs you to know the concepts behind the algorithm.

Some new things which I have used in this code are:

The .split() function has been called. So far I have glossed over why some functions need to be called using the '.' operator (like .__len__()) and others don't, but I'll go into it in depth very soon. Just bear with me for a few posts.

The .split() method explodes a string into a list. Since we didn't pass any arguments to it, it goes by the default and treats all whitespace as delimiters for splitting the string. So we get each word of the string as an element of the returned list. However, if some string is given as a parameter to split, say someString.split(delimiterString), then someString is split at each occurrence of delimiterString.

<someString>.join(someList) is the exact reverse of the split function. It is used to implode a list into a string. The list elements are joined with <someString> between them. Let's see examples to clarify these things even more.

IDLE 2.6.4
>>> "my name is Muktabh".split()
['my', 'name', 'is', 'Muktabh']
>>> for i in 'my name is Muktabh'.split():
    print i


my
name
is
Muktabh
>>> 'heehehhehhheh'.split('e')
['h', '', 'h', 'hh', 'hhh', 'h']
>>> 'heehehhehhheh'.split('hh')
['heehe', 'e', 'heh']
>>> ' '.join(range(9))

Traceback (most recent call last):
  File "<pyshell#6>", line 1, in <module>
    ' '.join(range(9))
TypeError: sequence item 0: expected string, int found
>>> 'elixir'.join(['goa','was','fun','and','so','were','the','drinks','there',';)'])
'goaelixirwaselixirfunelixirandelixirsoelixirwereelixirtheelixirdrinkselixirthereelixir;)'
>>> '<press tab key of your keyboard>'.join(['goa','was','fun','and','so','were','the','drinks','there',';)'])
'goa\twas\tfun\tand\tso\twere\tthe\tdrinks\tthere\t;)'
>>> print '<press tab key of your keyboard>'.join(['goa','was','fun','and','so','were','the','drinks','there',';)'])
goa     was     fun     and     so     were     the     drinks     there      ;)
>>>

The error was thrown because a list of non-strings was given as the argument to join. So be careful: Python doesn't convert a list of non-strings into a list of strings for you.

The rest is self-explanatory.

One more thing that deserves mention here is the sorted function. We have already used it, but here we use it as:

sortedWords=sorted(words,key=len)

What does key=len mean?

Well, this is the way in Python to pass an argument to one particular parameter of a function that has several parameters with default arguments, when we want to supply just one (or let's say not all) of them. There is a high probability that the above lines didn't make any sense, so here is an example to clarify.

def myFunc(param1,param2=[],param3='Muktabh'):

    <someCode>

Now suppose you want to call myFunc passing arguments only for param1 (which you have to pass any time you call the function anyway) and for param3, but not param2.

So we'll do myFunc(True,param3='pythonGod😛'). That's all we did in the call to sorted there. However, we passed len as an argument, and len is a function. How can we pass a function as an argument? Just bear with me for now and accept that it's somehow passed; we'll soon cover everything.

So by giving key=len we said that the list should be sorted according to the values the len function returns for its elements. If we hadn't supplied it, the sort would have worked some other way.

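IDLE 2.6.4
>>> sorted('my name is Itachi'.split())
['Itachi', 'is', 'my', 'name']
>>>

That's because the key has a default value: with no key given, the elements themselves are compared, and for strings uppercase letters sort before lowercase ones. Hope it's clear.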
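Both ideas, naming the parameter you want to fill and handing sorted a key function, fit in one short sketch. my_func is a made-up stand-in for myFunc that simply returns what it received so we can look at it, and the snippet also runs on Python 3.

```python
def my_func(param1, param2=None, param3='Muktabh'):
    """Hypothetical function: hand back the arguments it ended up with."""
    if param2 is None:
        param2 = []
    return (param1, param2, param3)


# Fill param1 by position and param3 by name; param2 keeps its default.
result = my_func(True, param3='pythonGod')

words = 'my name is Itachi'.split()
by_length = sorted(words, key=len)   # compare the words by len(word)
default_order = sorted(words)        # compare the words themselves

print(result)
print(by_length)
print(default_order)
```

Words of equal length ('my' and 'is') keep their original relative order, because sorted is a stable sort.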

But the code written above is still long by 'Python cool' standards. See this one doing the same job.

if __name__=='__main__':
    print ' '.join(sorted(raw_input('Enter The String').split(),key=lambda x:x.__len__())),'is the new \
String'

Here you are witnessing the extreme power of Python. Code which would have taken some 50 lines in C takes just one line in Python (remove the \ and bring the rest of the code onto the same line when executing). It's all yours to interpret, because it's almost the same as the code above, with just two complexities that I'll discuss next.

The first is the \ sign at the end of the second line. Look carefully: it's a character saying that the next line is a continuation of the same line, thus completing the string, which couldn't be completed on the same line. I don't know whether there is any use in writing out the logic behind it, but I'll put it here anyway. \ has a special significance for Python.

\t is a tab, \n is a new line and so on. The character generated by pressing the return key is the statement terminator for Python, and since here you put a \ before pressing return, the line break loses its significance. I hadn't done this in my own code, but since WordPress autoformatted my code, I decided to use it to keep the code safe from any error that would occur if someone copy-pasted it.

What next? Ah yes, the lambda functions. These are small anonymous functions written as a single expression. They are no faster than a normal function defined using def; they are just handier when a tiny function is needed in one place.

Syntax: lambda x : <return expression>

This is just a compact way of saying f(x){ return <return expression> }.
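To see that a lambda and a def really are interchangeable, here is a small sketch. The names length_of and length_lambda and the word list are made up for the example, and it also runs on Python 3.

```python
def length_of(x):
    """An ordinary named function."""
    return len(x)


# The same function written as a lambda and stored under a name.
length_lambda = lambda x: len(x)

words = ['drinks', 'so', 'fun']
sorted_with_def = sorted(words, key=length_of)
sorted_with_lambda = sorted(words, key=length_lambda)

print(sorted_with_def)
print(sorted_with_def == sorted_with_lambda)
```

In real code the lambda is usually written straight into the call, as in key=lambda x: len(x), which saves inventing a name for a function used only once.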

Check how I have used it in the above code, and of course try to figure the code out on your own.

Keep watching for cool Python stuff here on my blog.
   

Using Lists (Insertion Sort On Lists)

Here we'll see how to use lists to perform a popular algorithm called insertion sort. If you don't know about sorting or insertion sort, try reading the Wikipedia article http://en.wikipedia.org/wiki/Sorting_algorithm and the links given there.

if __name__=='__main__':
    print 'sorting ten numbers'
    dumb=raw_input('Press Enter to Continue')
    numbers=[]
    for i in xrange(10):
        numbers.append(int(raw_input('Enter the %d th number' % (i+1,))))
    i=0
    while (i<10):
        j=0
        while(j<i):
            #move numbers[i] to its place among the already sorted numbers before it
            if(numbers[i]<numbers[j]):
                numbers[i],numbers[j]=numbers[j],numbers[i]
            j+=1
        i+=1
    print 'sorted numbers',numbers

The above code insertion-sorts 10 numbers. Let's see how it works.

First of all we see:

dumb=raw_input('Press Enter to Continue')

This line is basically the MATLAB way to make the code halt till the user is ready. raw_input prints the given string as a prompt, reads one line of input and returns it as a string, which is saved in the variable on the left. Here we force the user to press return (which signals the end of input for raw_input) to make sure they are ready before we throw input requests at them. The variable dumb serves no purpose.

Next we initialise a variable 'numbers' with a blank list so that we can append elements to it in each iteration.

After this we take the numbers as input from the user (the numbers we are going to sort).

for i in xrange(10):
    numbers.append(int(raw_input('Enter the %d th number' % (i+1,))))

The for..in construct iterates over each element of the given list (or of any other iterable object). To make it run through a range of numbers we use built-in functions like 'range' and 'xrange'.

IDLE 2.6.4
>>> for i in xrange(10):
    print i

0
1
2
3
4
5
6
7
8
9
>>>

So why did I choose xrange over range above? range generates a real list (of integers from 0 to n-1 if range(n) is called) and then for..in iterates through it; making a list just to iterate through numbers is a waste of time and memory. xrange, on the other hand, produces the numbers one at a time as the loop asks for them, so no such list is ever built. Basically, when a for loop runs over any object, Python asks that object for something called an iterator (we'll see it in a lot of detail), and the loop pulls its values from the iterator one by one. A list's iterator walks over elements already stored in memory; an xrange object's iterator computes each number on demand, which is why it is more resource-efficient. In Python 3 (once known as Python 3000) the old range is gone: range there behaves like the present-day xrange, and xrange itself no longer exists.
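Here is a small sketch of the difference. The name lazy_range is my own invention for this example; it picks xrange on Python 2 and falls back to range on Python 3, so the same snippet should run on either.

```python
# Pick the lazy range: xrange on Python 2, range on Python 3.
try:
    lazy_range = xrange        # Python 2
except NameError:
    lazy_range = range         # Python 3: range is already lazy

numbers = lazy_range(1000000)   # no million-element list is built here
as_list = list(lazy_range(10))  # build a real list only when one is needed

# A for loop asks for an iterator and pulls one value at a time;
# iter() and next() do the same thing by hand.
iterator = iter(lazy_range(3))
first = next(iterator)
second = next(iterator)

print(len(numbers))
print(as_list)
print("%d %d" % (first, second))
```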

After that, we take a number as input, which comes in as a string (raw_input returns strings only), so we use the built-in function int() to get the integer value the string holds. If the string does not hold a valid integer, int() raises a ValueError.

Quick data type conversion functions like str() and int() are Python's built-in type constructors.
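As a rough illustration of these conversions and of the error case, consider the sketch below. The helper to_int is hypothetical and is not part of the sorting program; it only shows one way to catch the ValueError.

```python
def to_int(text):
    """Return int(text), or None when text is not a valid integer."""
    try:
        return int(text)
    except ValueError:
        return None

print(to_int('42'))      # a plain integer string converts fine
print(to_int(' 7 '))     # surrounding whitespace is tolerated
print(to_int('4.5'))     # not an integer literal, so we get None
print(str(42) + '!')     # str() goes the other way, number to string
```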

Each number given as input is appended to the list numbers, and the list is then sorted according to the normal insertion sort algorithm.
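To try the sorting loops without typing ten numbers every time, the same idea can be wrapped in a function. This is only a sketch: the name insertion_sort is mine, and for loops over range stand in for the while loops of the program above.

```python
def insertion_sort(values):
    """Sort a copy of values with the same nested-loop idea as the program above."""
    numbers = list(values)          # work on a copy, leave the input alone
    for i in range(len(numbers)):
        for j in range(i):
            # numbers[0:i] is already sorted; swapping whenever the new
            # element is smaller walks it into its place.
            if numbers[i] < numbers[j]:
                numbers[i], numbers[j] = numbers[j], numbers[i]
    return numbers

print(insertion_sort([5, 2, 9, 1, 5, 6]))
```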

ONLINE INSERTION SORT :

Normally insertion sort isn't a very good choice for sorting an array that is completely available up front (in the above code, for example, we constructed the entire list before sorting it). The real application of this algorithm is online sorting, i.e. putting each element in its sorted position as soon as it is entered. (Say the list is 34, 67, 98 and 44 is entered as the 4th element; it is added between 34 and 67.)

Here’s the code:

def appendInRightPlace(numbers,newNumber):
    numbers.append(newNumber)
    for i in range(len(numbers)-1):
        if numbers[i] > newNumber :
            # slice assignment changes the caller's list in place;
            # a plain numbers=... would only rebind the local name
            numbers[:]=numbers[0:i]+[numbers[len(numbers)-1]]+numbers[i:-1]
            return
    return #Optional, makes the program look one line longer

if __name__=='__main__':
    print 'sorting n numbers'
    n=int(raw_input('how many numbers ?'))
    dumb=raw_input('Press Enter to Continue')
    numbers=[]
    for i in xrange(n):
        appendInRightPlace(numbers,int(raw_input('Enter the %d th number' % (i+1,))))
    print 'sorted numbers',numbers

Let's go through this code to see what is being used here.

We'll start with the __name__ == '__main__' part. The first line that may seem unfamiliar here is:

appendInRightPlace(numbers,int(raw_input('Enter the %d th number' % (i+1,))))

Here we are passing the list numbers and the input converted to int to the function we have already defined, appendInRightPlace. The function doesn't return any value (in Python, a function that returns nothing explicitly returns None), so we don't need explicit returns to mark the end of the function, nor a variable to store its result.
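A tiny sketch of that behaviour follows. The function add_item and the list basket are made up for illustration.

```python
def add_item(items, item):
    items.append(item)      # modifies the caller's list in place
    # no return statement, so the call evaluates to None

basket = []
result = add_item(basket, 'apple')

print(result is None)
print(basket)
```

The caller sees the change because the function and the caller share the same list object, not because anything was returned.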

But another new thing is the argument to raw_input here: what does the expression 'Enter the %d th number' % (i+1,) mean?

A C coder will immediately relate to the %d sign, and it has exactly the same significance as in C's printf. For people who don't know: codes like %d (for integers) and %s (for strings) inside a string become placeholders if the string is immediately followed by a % sign. That is, Python replaces these placeholders, in order, with the values written in the () after the % sign.

See the following example at the interactive prompt to see how this can make a string dynamic:

IDLE 2.6.4
>>> "hi %s who's a %s God" % ('Muktabh','Pythonian')
"hi Muktabh who's a Pythonian God"
>>> "hi %s who's a %s God" % ('Sasuke','Uchiha Ninja')
"hi Sasuke who's a Uchiha Ninja God"
>>>

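Think of places like SQL queries where you can find this kind of dynamic string useful. (When the values come from users, though, it is safer to let the database driver substitute the parameters than to format them into the query string yourself.)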
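A few more placeholder variations, as a sketch. The variable names and sample values here are invented for the example.

```python
name = 'Muktabh'
count = 3

# One placeholder: a one-element tuple supplies the value.
line1 = 'Enter the %d th number' % (count,)
# Several placeholders: values are taken from the tuple in order.
line2 = '%s entered %d numbers' % (name, count)
# Width and precision codes work as in C's printf.
line3 = 'average = %.2f' % 2.5

print(line1)
print(line2)
print(line3)
```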

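Now in the function appendInRightPlace, everything is a simple implementation of the insertion step of insertion sort (the new element is appended at the end and then moved to its proper place), except that a special Python feature called slicing has been used in

numbers[:]=numbers[0:i]+[numbers[len(numbers)-1]]+numbers[i:-1]

So what does the above command mean? It builds a new list from three pieces: the elements before position i, the new number (currently the last element), and the elements from position i up to but not including the last one. Assigning to numbers[:] rather than to numbers replaces the contents of the caller's list in place; a plain numbers=... would only rebind the local name and leave the caller's list unsorted.

The construct anyList[m:n] is called list slicing in Python. It returns a new list with the elements of anyList at positions m through n-1. Let's see the following examples of slicing: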

IDLE 2.6.4
>>> a=['Muktabh',90,89.90,"kewl",'2008C6PS658']
>>> a[3:5]
['kewl', '2008C6PS658']
>>> a[1:3]
[90, 89.900000000000006]
>>> a[:2]
['Muktabh', 90]
>>> a[2:]
[89.900000000000006, 'kewl', '2008C6PS658']
>>> a[-4:-2]
[90, 89.900000000000006]
>>> a[-2:-4]
[]
>>> a[-2:-4:-1]
['kewl', 89.900000000000006]
>>> a[1:3:-1]
[]
>>> a[-1:-5:-2]
['2008C6PS658', 89.900000000000006]
>>>

The third argument used in some of the cases is the step of the slice, i.e. anyList[m:n:l] copies the mth element of anyList, then every lth element after it, stopping before it reaches the nth element, into a new list and returns it. The default l is 1 (obviously).

When slicing a list with the given arguments isn't possible, a blank list [] is returned.

For example, a[-2:-4] gives [] as there is no way to reach the fourth-last element from the second-last element by moving one element ahead at a time.
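The slicing rules can also be checked in a script rather than at the prompt. The following is a minimal sketch reusing the list a from the session above; the other variable names are my own.

```python
a = ['Muktabh', 90, 89.9, 'kewl', '2008C6PS658']

middle = a[1:3]          # elements at indices 1 and 2
every_other = a[::2]     # indices 0, 2, 4
reversed_copy = a[::-1]  # step -1 walks the whole list backwards
empty = a[-2:-4]         # cannot go from index 3 to index 1 with step +1

copy = a[:]              # a slice is always a new list...
copy[0] = 'changed'      # ...so changing it leaves a untouched

print(middle)
print(every_other)
print(reversed_copy)
print(empty)
print(a[0])
```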

Next is the concatenation of lists that was used in the same step. The + and * operators also work when applied to lists: you can concatenate two or more lists with + or repeat a list with *. Let's see some examples.

IDLE 2.6.4
>>> a=['hi','God','of','peace']
>>> b=['you','should','be','pain']
>>> c=a+b
>>> c
['hi', 'God', 'of', 'peace', 'you', 'should', 'be', 'pain']
>>> c= a*3
>>> c
['hi', 'God', 'of', 'peace', 'hi', 'God', 'of', 'peace', 'hi', 'God', 'of', 'peace']
>>> c+=b
>>> c
['hi', 'God', 'of', 'peace', 'hi', 'God', 'of', 'peace', 'hi', 'God', 'of', 'peace', 'you', 'should', 'be', 'pain']
>>>

The rest of the code is the algorithm for online sorting.
I put the new number at the end of the list (which is otherwise sorted) and then move this number to its appropriate place, thus moving all numbers larger than it one location ahead.
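For comparison, Python's standard library already offers this "insert into a sorted list" step as bisect.insort. The sketch below swaps that in for the slicing approach; it is an alternative, not the method used in the code above, and the name append_in_right_place is only for illustration.

```python
import bisect

def append_in_right_place(numbers, new_number):
    """Insert new_number into the already sorted list numbers, in place."""
    bisect.insort(numbers, new_number)

numbers = []
for value in [34, 98, 67, 44]:
    append_in_right_place(numbers, value)

print(numbers)
```

bisect.insort finds the position with a binary search, so it needs fewer comparisons than the linear scan, although the elements after that position still have to be shifted.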
