Python Programming Language
Last changed on Mon Feb 12 16:05:28 2001 EST
(Entries marked with ** were changed within the last 24 hours; entries marked with * were changed within the last 7 days.)
To find out more, the best thing to do is to start reading the tutorial from the documentation set (see a few questions further down).
See also question 1.17 (what is Python good for).
By now I don't care any more whether you use a Python, some other snake, a foot or 16-ton weight, or a wood rat as a logo for Python!
It is a gzipped tar file containing the complete C source, LaTeX documentation, Python library modules, example programs, and several useful pieces of freely distributable software. This will compile and run out of the box on most UNIX platforms. (See section 7 for non-UNIX information.)
Older versions of Python, including Python 1.6 and Python 1.5.2, are also available from python.org.
The LaTeX source for the documentation is part of the source distribution. If you don't have LaTeX, the latest Python documentation set is available, in various formats like postscript and html, by anonymous ftp - visit the above URL for links to the current versions.
PostScript for a high-level description of Python is in the file nluug-paper.ps (a separate file on the ftp site).
USA:
ftp://ftp.python.org/pub/python/
ftp://gatekeeper.dec.com/pub/plan/python/
ftp://ftp.uu.net/languages/python/
ftp://ftp.wustl.edu/graphics/graphics/sgi-stuff/python/
ftp://ftp.sterling.com/programming/languages/python/
ftp://uiarchive.cso.uiuc.edu/pub/lang/python/
ftp://ftp.pht.com/mirrors/python/python/
ftp://ftp.cdrom.com/pub/python/

Europe:

ftp://ftp.cwi.nl/pub/python/
ftp://ftp.funet.fi/pub/languages/python/
ftp://ftp.sunet.se/pub/lang/python/
ftp://unix.hensa.ac.uk/mirrors/uunet/languages/python/
ftp://ftp.lip6.fr/pub/python/
ftp://sunsite.cnlab-switch.ch/mirror/python/
ftp://ftp.informatik.tu-muenchen.de/pub/comp/programming/languages/python/

Australia:
ftp://ftp.dstc.edu.au/pub/python/
More info about the newsgroup and mailing list, and about other lists, can be found at http://www.python.org/psa/MailingLists.html.
Archives of the newsgroup are kept by Deja News and accessible through the "Python newsgroup search" web page, http://www.python.org/search/search_news.html. This page also contains pointers to other archival collections.
You can also search online bookstores for "Python" (and filter out the Monty Python references; or perhaps search for "Python" and "language").
Most publications about Python are collected on the Python web site:
http://www.python.org/doc/Publications.html

It is no longer recommended to reference this very old article by Python's author:
Guido van Rossum and Jelke de Boer, "Interactively Testing Remote Servers Using the Python Programming Language", CWI Quarterly, Volume 4, Issue 4 (December 1991), Amsterdam, pp 283-303.
Alpha, beta and release candidate versions have an additional suffix. The suffix for an alpha version is "aN" for some small number N, the suffix for a beta version is "bN" for some small number N, and the suffix for a release candidate version is "cN" for some small number N.
Note that (for instance) all versions labeled 2.0aN precede the versions labeled 2.0bN, which precede versions labeled 2.0cN, and those precede 2.0.
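The ordering rule above can be sketched as a sort key; the helper name and regular expression here are illustrative, not part of any release tooling:

```python
import re

def release_key(version):
    """Sort key implementing the rule above: aN < bN < cN < final release."""
    m = re.match(r'^(\d+)\.(\d+)(?:([abc])(\d+))?$', version)
    major, minor, stage, n = m.groups()
    # A missing suffix means a final release, which sorts after all candidates.
    stage_rank = {'a': 0, 'b': 1, 'c': 2, None: 3}[stage]
    return (int(major), int(minor), stage_rank, int(n or 0))

tags = ['2.0', '2.0c1', '2.0b1', '2.0a2', '2.0a1']
print(sorted(tags, key=release_key))
# -> ['2.0a1', '2.0a2', '2.0b1', '2.0c1', '2.0']
```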
As a rule, no changes are made between release candidates and the final release unless there are show-stopper bugs.
In particular, if you honor the copyright rules, it's OK to use Python for commercial use, to sell copies of Python in source or binary form, or to sell products that enhance Python or incorporate Python (or part of it) in some form. I would still like to know about all commercial use of Python!
I had extensive experience with implementing an interpreted language in the ABC group at CWI, and from working with this group I had learned a lot about language design. This is the origin of many Python features, including the use of indentation for statement grouping and the inclusion of very-high-level data types (although the details are all different in Python).
I had a number of gripes about the ABC language, but also liked many of its features. It was impossible to extend the ABC language (or its implementation) to remedy my complaints -- in fact its lack of extensibility was one of its biggest problems. I had some experience with using Modula-2+ and talked with the designers of Modula-3 (and read the M3 report). M3 is the origin of the syntax and semantics used for exceptions, and some other Python features.
I was working in the Amoeba distributed operating system group at CWI. We needed a better way to do system administration than by writing either C programs or Bourne shell scripts, since Amoeba had its own system call interface which wasn't easily accessible from the Bourne shell. My experience with error handling in Amoeba made me acutely aware of the importance of exceptions as a programming language feature.
It occurred to me that a scripting language with a syntax like ABC but with access to the Amoeba system calls would fill the need. I realized that it would be foolish to write an Amoeba-specific language, so I decided that I needed a language that was generally extensible.
During the 1989 Christmas holidays, I had a lot of time on my hands, so I decided to give it a try. During the next year, while still mostly working on it in my own time, Python was used in the Amoeba project with increasing success, and feedback from colleagues led to many early improvements.
In February 1991, after just over a year of development, I decided to post to USENET. The rest is in the Misc/HISTORY file.
The two main reasons to use Python are:
- Portable
- Easy to learn

The three main reasons to use Python are:

- Portable
- Easy to learn
- Powerful standard library

(And nice red uniforms.)
And remember, there is no rule six.
In the area of basic text manipulation, core Python (without any non-core extensions) is easier to use than, and roughly as fast as, just about any other language. This makes Python well suited to many system administration tasks, to CGI programming, and to other application areas that manipulate text and strings.
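As a rough illustration (the log lines below are invented), a typical task of this kind needs nothing beyond core string methods and a dictionary:

```python
# Hypothetical log data; a real script would read a file instead.
log = """GET /index.html 200
GET /missing 404
POST /form 200"""

# Tally status codes using only core string operations.
counts = {}
for line in log.splitlines():
    method, path, status = line.split()
    counts[status] = counts.get(status, 0) + 1

print(counts)  # -> {'200': 2, '404': 1}
```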
When augmented with standard extensions (such as PIL, COM, Numeric, oracledb, kjbuckets, tkinter, win32api, etc.) or special purpose extensions (that you write, perhaps using helper tools such as SWIG, or using object protocols such as ILU/CORBA or COM) Python becomes a very convenient "glue" or "steering" language that helps make heterogeneous collections of unrelated software packages work together. For example by combining Numeric with oracledb you can help your SQL database do statistical analysis, or even Fourier transforms. One of the features that makes Python excel in the "glue language" role is Python's simple, usable, and powerful C language runtime API.
Many developers also use Python extensively as a graphical user interface development aid.
http://www.python.org/ftp/python/src/py152.tgz
http://www.python.org/emacs/python-mode/index.html

There are many other choices, for Unix, Windows or Macintosh. Richard Jones compiled a table from postings on the Python newsgroup:

http://www.bofh.asn.au/~richard/editors.html

See also FAQ question 7.10 for some more Mac and Win options.
If you've never programmed before, you might try http://members.xoom.com/alan_gauld/tutor/tutindex.htm , a programming tutorial that attempts to teach programming and Python simultaneously.
The Python tutor mail list was also created for those new to Python and for those new to programming in general. See http://www.python.org/mailman/listinfo/tutor
Jacek Artymiak has created a Python Users Counter; you can see the current count by visiting http://www.wszechnica.safenet.pl/cgi-bin/checkpythonuserscounter.py (this will not increment the counter; use the link there if you haven't added yourself already). Most Python users appear not to have registered themselves.
Another statistic is the number of accesses to the Python WWW server. Have a look at http://www.python.org/stats/.
At CNRI (Python's new home), we have written two large applications: Grail, a fully featured web browser (see http://grail.cnri.reston.va.us), and the Knowbot Operating Environment, a distributed environment for mobile code.
The University of Virginia uses Python to control a virtual reality engine. See http://alice.cs.cmu.edu.
The ILU project at Xerox PARC can generate Python glue for ILU interfaces. See ftp://ftp.parc.xerox.com/pub/ilu/ilu.html. ILU is a free CORBA compliant ORB which supplies distributed object connectivity to a host of platforms using a host of languages.
Mark Hammond and Greg Stein and others are interfacing Python to Microsoft's COM and ActiveX architectures. This means, among other things, that Python may be used in active server pages or as a COM controller (for example to automatically extract information from or insert it into Excel or MSAccess or any other COM-aware application). Mark claims Python can even be an ActiveX scripting host (which means you could embed JScript inside a Python application, if you had a strange sense of humor). Python/AX/COM is distributed as part of the PythonWin distribution.
The University of California, Irvine uses a student administration system called TELE-Vision written entirely in Python. Contact: Ray Price [email protected].
The Melbourne Cricket Ground (MCG) in Australia (a 100,000+ person venue) has its scoreboard system written largely in Python on MS Windows. Python expressions are used to create almost every scoring entry that appears on the board. The move from exclusive C++ to Python/C++ has provided a level of functionality that would simply not have been viable otherwise.
See also the next question.
Note: this FAQ entry is really old. See http://www.python.org/psa/Users.html for a more recent list.
Also see peps/ for proposals.
See peps/pep-0005.html for the proposed mechanism for creating backwards-incompatibilities.
Please note that the PSA is now obsolete. There is no need to join it now.
Since Python is available free of charge, there are no absolute guarantees. If there are unforeseen problems, liability is the user's rather than the developers', and there is nobody you can sue for damages.
Python does few date manipulations, and what it does is all based on the Unix representation for time (even on non-Unix systems) which uses seconds since 1970 and won't overflow until 2038.
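A quick check of that representation with the standard time module (a sketch; the 2038 figure assumes a signed 32-bit count of seconds, even though the interpreter itself may use a wider type):

```python
import time

# The epoch of the Unix representation:
print(time.gmtime(0).tm_year)          # -> 1970

# The largest signed 32-bit second count runs out in:
print(time.gmtime(2**31 - 1).tm_year)  # -> 2038
```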
It is still common to start students with a procedural (subset of a) statically typed language such as Pascal, C, or a subset of C++ or Java. I think that students may be better served by learning Python as their first language. Python has a very simple and consistent syntax and a large standard library. Most importantly, using Python in a beginning programming course permits students to concentrate on important programming skills, such as problem decomposition and data type design.
With Python, students can be quickly introduced to basic concepts such as loops and procedures. They can probably even work with user-defined objects in their very first course; they could implement a tree structure as nested Python lists, for example. For a student who has never programmed before, using a statically typed language seems unnatural. It presents additional complexity that the student must master and slows the pace of the course. The students are trying to learn to think like a computer, decompose problems, design consistent interfaces, and encapsulate data. While learning to use a statically typed language is important, it is not necessarily the best topic to address in the students' first programming course.
Many other aspects of Python make it a good first language. Python has a large standard library (like Java) so that students can be assigned programming projects very early in the course that do something. Assignments aren't restricted to the standard four-function calculator and check balancing programs. By using the standard library, students can gain the satisfaction of working on realistic applications as they learn the fundamentals of programming. Using the standard library also teaches students about code reuse.
Python's interactive interpreter also enables students to test language features while they're programming. They can keep a window with the interpreter running while they enter their programs' source in another window. If they can't remember the methods for a list, they can do something like this:
>>> L = []
>>> dir(L)
['append', 'count', 'extend', 'index', 'insert', 'pop', 'remove',
'reverse', 'sort']
>>> print L.append.__doc__
L.append(object) -- append object to end
>>> L.append(1)
>>> L
[1]

With the interpreter, documentation is never far from the student as he's programming.
There are also good IDEs for Python. Guido van Rossum's IDLE is a cross-platform IDE for Python that is written in Python using Tk. There is also a Windows specific IDE called PythonWin. Emacs users will be happy to know that there is a very good Python mode for Emacs. All of these programming environments provide syntax highlighting, auto-indenting, and access to the interactive interpreter while coding. For more information about IDEs, see XXX.
If your department is currently using Pascal because it was designed to be a teaching language, then you'll be happy to know that Guido van Rossum designed Python to be simple to teach to everyone but powerful enough to implement real world applications. Python makes a good language for first time programmers because that was one of Python's design goals. There are papers at http://www.python.org/doc/essays/ on the Python website by Python's creator explaining his objectives for the language. One that may interest you is titled "Computer Programming for Everybody" http://www.python.org/doc/essays/cp4e.html
If you're seriously considering Python as a language for your school, Guido van Rossum may even be willing to correspond with you about how the language would fit in your curriculum. See http://www.python.org/doc/FAQ.html#2.2 for examples of Python's use in the "real world."
While Python, its source code, and its IDEs are freely available, this consideration should not be weighed too heavily. There are other free languages (Java, free C compilers), and many companies are willing to waive some or all of their fees for student programming tools if it guarantees that a whole graduating class will know how to use their tools. It may also be comforting to know that Python is under the stewardship of CNRI (Corporation for National Research Initiatives), a non-profit whose mission is to foster research and development for the National Information Infrastructure. It "undertakes, fosters, and promotes research in the public interest."
While Python jobs may not be as prevalent as C/C++/Java jobs, teachers should not worry about teaching students critical job skills in their first course. The skills that win students a job are those they learn in their senior classes and internships. Their first programming courses are there to lay a solid foundation in programming fundamentals. The primary question in choosing the language for such a course should be which language permits the students to learn this material without hindering or limiting them.
Another argument for Python is that there are many tasks for which something like C++ is overkill. That's where languages like Python, Perl, Tcl, and Visual Basic thrive. It's critical for students to know something about these languages. (Every employer for whom I've worked used at least one such language.) Of the languages listed above, Python probably makes the best language in a programming curriculum since its syntax is simple, consistent, and not unlike other languages (C/C++/Java) that are probably in the curriculum. By starting students with Python, a department simultaneously lays the foundations for other programming courses and introduces students to the type of language that is often used as a "glue" language. As an added bonus, Python can be used to interface with Microsoft's COM components (thanks to Mark Hammond). There is also JPython, which can be used to connect Java components.
If you currently start students with Pascal or C/C++ or Java, you may be worried they will have trouble learning a statically typed language after starting with Python. I think that this fear most often stems from the fact that the teacher started with a statically typed language, and we tend to like to teach others in the same way we were taught. In reality, the transition from Python to one of these other languages is quite simple.
To motivate a statically typed language such as C++, begin the course by explaining that unlike Python, their first language, C++ is compiled to a machine dependent executable. Explain that the point is to make a very fast executable. To permit the compiler to make optimizations, programmers must help it by specifying the "types" of variables. By restricting each variable to a specific type, the compiler can reduce the book-keeping it has to do to permit dynamic types. The compiler also has to resolve references at compile time. Thus, the language gains speed by sacrificing some of Python's dynamic features. Then again, the C++ compiler provides type safety and catches many bugs at compile time instead of run time (a critical consideration for many commercial applications). C++ is also designed for very large programs where one may want to guarantee that others don't touch an object's implementation. C++ provides very strong language features to separate an object's implementation from its interface. Explain why this separation is a good thing.
The first day of a C++ course could then be a whirlwind introduction to what C++ requires and provides. The point here is that after a semester or two of Python, students are hopefully competent programmers. They know how to handle loops and write procedures. They've also worked with objects, thought about the benefits of consistent interfaces, and used the technique of subclassing to specialize behavior. Thus, a whirlwind introduction to C++ could show them how objects and subclassing look in C++. The potentially difficult concepts of object-oriented design were taught without the additional obstacles presented by a language such as C++ or Java. When learning one of these languages, the students would already understand the "road map." They understand objects; they would just be learning how objects fit in a statically typed language. Language requirements and compiler errors that seem unnatural to beginning programmers make sense in this new context. Many students will find it helpful to be able to write a fast prototype of their algorithms in Python. Thus, they can test and debug their ideas before they attempt to write the code in the new language, saving the effort of working with C++ types for when they've discovered a working solution for their assignments. When they get annoyed with the rigidity of types, they'll be happy to learn about containers and templates to regain some of the lost flexibility Python afforded them. Students may also gain an appreciation for the fact that no language is best for every task. They'll see that C++ is faster, but they'll know that they can gain flexibility and development speed with Python when execution speed isn't critical.
If you have any concerns that weren't addressed here, try posting to the Python newsgroup. Others there have done some work with using Python as an instructional tool. Good luck. We'd love to hear about it if you choose Python for your course.
import test.autotest

In 1.4 or earlier, use

import autotest

The test set doesn't test all features of Python, but it goes a long way to confirm that Python is actually working.
NOTE: if "make test" fails, don't just mail the output to the newsgroup -- this doesn't give enough information to debug the problem. Instead, find out which test fails, and run that test manually from an interactive interpreter. For example, if "make test" reports that test_spam fails, try this interactively:
import test.test_spam

This generally produces more verbose output which can be diagnosed to debug the problem.
#! /usr/local/bin/python --

You can also use this interactively:

python -- script.py [options]

Note that a working getopt implementation is provided in the Python distribution (in Python/getopt.c) but not automatically used.
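The standard library also ships a getopt module that follows the same conventions; a small sketch with made-up option letters:

```python
import getopt

# '-v' is a flag; '-o' (note the trailing colon) takes an argument.
opts, args = getopt.getopt(['-v', '-o', 'out.txt', 'input.txt'], 'vo:')
print(opts)  # -> [('-v', ''), ('-o', 'out.txt')]
print(args)  # -> ['input.txt']

# A '--' marker ends option processing, as described above:
opts, args = getopt.getopt(['--', '-v'], 'v')
print(opts)  # -> []
print(args)  # -> ['-v']
```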
#readline readline.c -lreadline -ltermcap
in Modules/Setup. The configuration option --with-readline is no longer supported, at least in Python 2.0. Some hints on building and using the readline library: On SGI IRIX 5, you may have to add the following to rldefs.h:
#ifndef sigmask
#define sigmask(sig) (1L << ((sig)-1))
#endif

On some systems, you will have to add #include "rldefs.h" to the top of several source files, and if you use the VPATH feature, you will have to add dependencies of the form foo.o: foo.c to the Makefile for several values of foo. The readline library requires use of the termcap library. A known problem with this is that it contains entry points which cause conflicts with the STDWIN and SGI GL libraries. The STDWIN conflict can be solved by adding a line saying '#define werase w_erase' to the stdwin.h file (in the STDWIN distribution, subdirectory H). The GL conflict has been solved in the Python configure script by a hack that forces use of the static version of the termcap library. Check the newsgroup gnu.bash.bug news:gnu.bash.bug for specific problems with the readline library (I don't read this group but I've been told that it is the place for readline bugs).
*shared*

in the Setup file. Shared library code must be compiled with "-fpic". If a .o file for the module already exists that was compiled for static linking, you must remove it or do "make clean" in the Modules directory.
[Andy Dustman] On glibc systems (i.e. RedHat 5.0+), LinuxThreads is obsoleted by POSIX threads (-lpthread). If you upgraded from an earlier RedHat, remove LinuxThreads with "rpm -e linuxthreads linuxthreads-devel". Then run configure using --with-thread as above.
For Windows, see question 7.11.
(Also note that when compiling unpatched Python 1.5.1 against Tcl/Tk 7.6/4.2 or older, you get an error on Tcl_Finalize. See the 1.5.1 patch page at http://www.python.org/1.5/patches-1.5.1/.)
There's also a 64-bit bugfix for Tcl/Tk; see
http://grail.cnri.reston.va.us/grail/info/patches/tk64bit.txt
set PYTHONPATH=c:\python;c:\python\lib;c:\python\scripts
(assuming Python was installed in c:\python)
This shows up when building a DLL under MSVC. There are two ways to address this: either compile the module as C++, or change your code to something like:
statichere PyTypeObject bstreamtype = {
    PyObject_HEAD_INIT(NULL)    /* must be set by init function */
    0,
    "bstream",
    sizeof(bstreamobject),

...

void initbstream()
{
    /* Patch object type */
    bstreamtype.ob_type = &PyType_Type;
    Py_InitModule("bstream", functions);
    ...
}
% python script.py
...some output...
% python script.py >file
% cat file
% # no output
% python script.py | cat
% # no output
%

Nobody knows what causes this, but it is apparently a Linux bug. Most Linux users are not affected by this.
There's at least one report of someone who reinstalled Linux (presumably a newer version) and Python and got rid of the problem; so this may be the solution.
File "<stdin>", line 1
    import sys
    ^
Syntax Error: "invalid syntax"

Did you compile it yourself? This is usually caused by an incompatibility between libc 5.4.x and earlier libc's. In particular, programs compiled with libc 5.4 give incorrect results on systems which had libc 5.2 installed because the ctype.h file is broken. In this case, Python can't recognize which characters are letters and so on. The fix is to install the C library which was used when building the binary that you installed, or to compile Python yourself. When you do this, make sure the C library header files which get used by the compiler match the installed C library.
[adapted from an answer by Martin v. Loewis]
PS [adapted from Andreas Jung]: If you have upgraded to libc 5.4.x and the problem persists, check your library path for an older version of libc. Try a clean update of libc (both the libraries and the header files), then recompile everything.
>>> from Tkinter import *
>>> root = Tk()
XIO:  fatal IO error 0 (Unknown error) on X server ":0.0"
      after 45 requests (40 known processed) with 1 events remaining.

The reason is that the default Xlib is not built with support for threads. If you rebuild Xlib with threads enabled the problems go away. Alternatively, you can rebuild Python without threads ("make clean" first!).
(Disclaimer: this is from memory.)
python
>>> import _tkinter
>>> import Tkinter
>>> Tkinter._test()

This should pop up a window with two buttons, one "Click me" and one "Quit".
If the first statement (import _tkinter) fails, your Python installation probably has not been configured to support Tcl/Tk. On Unix, if you have installed Tcl/Tk, you have to rebuild Python after editing the Modules/Setup file to enable the _tkinter module and the TKPATH environment variable.
It is also possible to get complaints about Tcl/Tk version number mismatches or missing TCL_LIBRARY or TK_LIBRARY environment variables. These have to do with Tcl/Tk installation problems.
A common problem is to have installed versions of tcl.h and tk.h that don't match the installed version of the Tcl/Tk libraries; this usually results in linker errors or (when using dynamic loading) complaints about missing symbols during loading the shared library.
On Unix, if you have enabled the readline module (i.e. if Emacs-style command line editing and bash-style history works for you), you can add this by importing the undocumented standard library module "rlcompleter". When completing a simple identifier, it completes keywords, built-ins and globals in __main__; when completing NAME.NAME..., it evaluates (!) the expression up to the last dot and completes its attributes.
This way, you can do "import string", type "string.", hit the completion key twice, and see the list of names defined by the string module.
Tip: to use the tab key as the completion key, call
readline.parse_and_bind("tab: complete")

You can put this in a ~/.pythonrc file, and set the PYTHONSTARTUP environment variable to ~/.pythonrc. This will cause completion to be enabled whenever you run Python interactively.
Notes (see the docstring for rlcompleter.py for more information):
* The evaluation of the NAME.NAME... form may cause arbitrary application defined code to be executed if an object with a __getattr__ hook is found. Since it is the responsibility of the application (or the user) to enable this feature, I consider this an acceptable risk. More complicated expressions (e.g. function calls or indexing operations) are not evaluated.
* GNU readline is also used by the built-in functions input() and raw_input(), and thus these also benefit or suffer from the completion features. Clearly an interactive application can benefit by specifying its own completer function and using raw_input() for all its input.
* When stdin is not a tty device, GNU readline is never used, and this module (and the readline module) are silently inactive.
It's just a nightmare to get this to work on all different platforms. Shared library portability is a pain. And yes, I know about GNU libtool -- but it requires me to use its conventions for filenames etc, and it would require a complete and utter rewrite of all the makefile and config tools I'm currently using.
In practice, few applications embed Python -- it's much more common to have Python extensions, which already are shared libraries. Also, serious embedders often want total control over which Python version and configuration they use so they wouldn't want to use a standard shared library anyway. So while the motivation of saving space when lots of apps embed Python is nice in theory, I doubt that it will save much in practice. (Hence the low priority I give to making a shared library.)
For Linux systems, the simplest method of producing libpython1.5.so seems to be (originally from the Minotaur project web page, http://mini.net/pub/ts2/minotaur.html):
make distclean
./configure
make OPT="-fpic -O2"
mkdir .extract
(cd .extract; ar xv ../libpython1.5.a)
gcc -shared -o libpython1.5.so .extract/*.o
rm -rf .extract
In file included from /usr/include/sys/stream.h:26,
                 from /usr/include/netinet/in.h:38,
                 from /usr/include/netdb.h:96,
                 from ./socketmodule.c:121:
/usr/include/sys/model.h:32: #error "No DATAMODEL_NATIVE specified"

Solution: rebuild GCC for Solaris 2.6. You might be able to simply re-run fixincludes, but people have had mixed success with doing that.
Use "make clean" to reduce the size of the source/build directory after you're happy with your build and installation. If you have already tried to build python and you'd like to start over, you should use "make clobber". It does a "make clean" and also removes files such as the partially built Python library from a previous build.
Bugs: http://sourceforge.net/bugs/?group_id=5470
Patches: http://sourceforge.net/patch/?group_id=5470
symbol PyExc_RuntimeError: referenced symbol not found
There is a problem with the configure script for Python 1.5.2 under Solaris 7 with gcc 2.95: configure should set the make variable

LINKFORSHARED=-Xlinker --export-dynamic

in Modules/Makefile, but does not. Manually add this line to Modules/Makefile. This builds a Python executable that can load shared library extensions (xxx.so).
Pythonwin also has a GUI debugger available, based on bdb, which colors breakpoints and has quite a few cool features (including debugging non-Pythonwin programs). The interface needs some work, but is interesting nonetheless. A reference can be found in
http://www.python.org/ftp/python/pythonwin/pwindex.html

More recent versions of PythonWin are available as part of ActivePython. See
http://www.activestate.com/Products/ActivePython/index.html

Richard Wolff has created a modified version of pdb, called Pydb, for use with the popular Data Display Debugger (DDD). Pydb can be found at http://daikon.tuc.noao.edu/python/, and DDD can be found at http://www.cs.tu-bs.de/softech/ddd/
The IDLE interactive development environment, normally available at Tools/idle in a standard distribution, also contains a graphical debugger.
# A user-defined class behaving almost identically
# to a built-in dictionary.
class UserDict:
    def __init__(self):
        self.data = {}
    def __repr__(self):
        return repr(self.data)
    def __cmp__(self, dict):
        if type(dict) == type(self.data):
            return cmp(self.data, dict)
        else:
            return cmp(self.data, dict.data)
    def __len__(self):
        return len(self.data)
    def __getitem__(self, key):
        return self.data[key]
    def __setitem__(self, key, item):
        self.data[key] = item
    def __delitem__(self, key):
        del self.data[key]
    def keys(self):
        return self.data.keys()
    def items(self):
        return self.data.items()
    def values(self):
        return self.data.values()
    def has_key(self, key):
        return self.data.has_key(key)

A2. See Jim Fulton's ExtensionClass for an example of a mechanism which allows you to have superclasses which you can inherit from in Python -- that way you can have some methods from a C superclass (call it a mixin) and some methods from either a Python superclass or your subclass. See http://www.digicool.com/papers/ExtensionClass.html.
A3. The Boost Python Library (BPL, http://www.boost.org/libs/python/doc/index.html) provides a way of doing this from C++ (i.e. you can inherit from an extension class written in C++ using the BPL).
In Python 2.0, the curses module has been greatly extended, starting from Oliver Andrich's enhanced version, to provide many additional functions from ncurses and SYSV curses, such as colour, alternative character set support, pads, and mouse support. This means the module is no longer compatible with operating systems that only have BSD curses, but there don't seem to be any currently maintained OSes that fall into this category.
For Python 1.5.2: You need to import sys and assign a function to sys.exitfunc; it will be called when your program exits, is killed by an unhandled exception, or (on UNIX) receives a SIGHUP or SIGTERM signal.
    class MultiplierClass:
        def __init__(self, factor):
            self.factor = factor
        def multiplier(self, argument):
            return argument * self.factor

    def generate_multiplier(factor):
        return MultiplierClass(factor).multiplier

    twice = generate_multiplier(2)
    print twice(10)   # Output: 20

An alternative solution uses default arguments, e.g.:

    def generate_multiplier(factor):
        def multiplier(arg, fact = factor):
            return arg*fact
        return multiplier

    twice = generate_multiplier(2)
    print twice(10)   # Output: 20
    list.reverse()
    try:
        for x in list:
            "do something with x"
    finally:
        list.reverse()

This has the disadvantage that while you are in the loop, the list is temporarily reversed. If you don't like this, you can make a copy. This appears expensive but is actually faster than other solutions:

    rev = list[:]
    rev.reverse()
    for x in rev:
        <do something with x>

If it's not a list, a more general but slower solution is:

    for i in range(len(sequence)-1, -1, -1):
        x = sequence[i]
        <do something with x>

A more elegant solution is to define a class which acts as a sequence and yields the elements in reverse order (solution due to Steve Majewski):

    class Rev:
        def __init__(self, seq):
            self.forw = seq
        def __len__(self):
            return len(self.forw)
        def __getitem__(self, i):
            return self.forw[-(i + 1)]

You can now simply write:

    for x in Rev(list):
        <do something with x>

Unfortunately, this solution is the slowest of all, due to the method call overhead...
Remember that many standard optimization heuristics you may know from other programming experience may well apply to Python. For example it may be faster to send output to output devices using larger writes rather than smaller ones in order to avoid the overhead of kernel system calls. Thus CGI scripts that write all output in "one shot" may be notably faster than those that write lots of small pieces of output.
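As a small sketch of the "one shot" idea (io.StringIO here is just a stand-in for a real output stream, and write_once is an invented helper name):

```python
import io

def write_once(out, chunks):
    # Build the whole payload in memory first, then issue a single write
    # instead of one write (and potentially one system call) per piece.
    out.write("".join(chunks))

buf = io.StringIO()
write_once(buf, ["Content-Type: text/html\r\n\r\n", "<html>", "</html>"])
assert buf.getvalue() == "Content-Type: text/html\r\n\r\n<html></html>"
```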
Also, be sure to use "aggregate" operations where appropriate. For example the "slicing" feature allows programs to chop up lists and other sequence objects in a single tick of the interpreter mainloop using highly optimized C implementations. Thus to get the same effect as
    L2 = []
    for i in range(3):
        L2.append(L1[i])

it is much shorter and far faster to use

    L2 = list(L1[:3])   # "list" is redundant if L1 is a list.

Note that the map() function, particularly used with builtin methods or builtin functions, can be a convenient accelerator. For example, to pair the elements of two lists together:

    >>> map(None, [1,2,3], [4,5,6])
    [(1, 4), (2, 5), (3, 6)]

or to compute a number of sines:

    >>> map(math.sin, (1,2,3,4))
    [0.841470984808, 0.909297426826, 0.14112000806, -0.756802495308]

The map operation completes very quickly in such cases.
Other examples of aggregate operations include the join, joinfields, split, and splitfields functions of the standard string module. For example, if s1..s7 are large (10K+) strings then string.joinfields([s1,s2,s3,s4,s5,s6,s7], "") may be far faster than the more obvious s1+s2+s3+s4+s5+s6+s7, since the "summation" will compute many subexpressions, whereas joinfields does all copying in one pass. For manipulating strings, also consider the regular expression libraries and the "substitution" operations String % tuple and String % dictionary. Also be sure to use the list.sort builtin method to do sorting, and see FAQs 4.51 and 4.59 for examples of moderately advanced usage -- list.sort beats other techniques for sorting in all but the most extreme circumstances.
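A short sketch of the difference (written here with the string-method spelling, "".join, which is the modern equivalent of string.joinfields, so it runs as-is):

```python
parts = ["spam", "eggs", "ham"]

# One pass, one result string:
joined = "".join(parts)

# Same result, but repeated '+' builds an intermediate string at each
# step ("spam" + "eggs", then that result + "ham", and so on):
summed = parts[0] + parts[1] + parts[2]

assert joined == summed == "spameggsham"
```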
There are many other aggregate operations available in the standard libraries and in contributed libraries and extensions.
Another common trick is to "push loops into functions or methods." For example suppose you have a program that runs slowly and you use the profiler (profile.run) to determine that a Python function ff is being called lots of times. If you notice that ff
    def ff(x):
        ...do something with x computing result...
        return result

tends to be called in loops like (A)

    list = map(ff, oldlist)

or (B)

    for x in sequence:
        value = ff(x)
        ...do something with value...

then you can often eliminate function call overhead by rewriting ff to

    def ffseq(seq):
        resultseq = []
        for x in seq:
            ...do something with x computing result...
            resultseq.append(result)
        return resultseq

and rewrite (A) to

    list = ffseq(oldlist)

and (B) to

    for value in ffseq(sequence):
        ...do something with value...

Other single calls ff(x) translate to ffseq([x])[0] with little penalty. Of course this technique is not always appropriate and there are other variants, which you can figure out.
You can gain some performance by explicitly storing the results of a function or method lookup into a local variable. A loop like
    for key in token:
        dict[key] = dict.get(key, 0) + 1

resolves dict.get every iteration. If the method isn't going to change, a faster implementation is

    dict_get = dict.get   # look up the method once
    for key in token:
        dict[key] = dict_get(key, 0) + 1

Default arguments can be used to determine values once, at function definition time instead of at run time. This can only be done for functions or objects which will not be changed during program execution, such as replacing
    def degree_sin(deg):
        return math.sin(deg * math.pi / 180.0)

with

    def degree_sin(deg, factor = math.pi/180.0, sin = math.sin):
        return sin(deg * factor)

Because this trick uses default arguments for terms which should not be changed, it should only be used when you are not concerned with presenting a possibly confusing API to your users.
For an anecdote related to optimization, see
http://www.python.org/doc/essays/list2str.html
    import modname
    reload(modname)

Warning: this technique is not 100% fool-proof. In particular, modules containing statements like

    from modname import some_objects

will continue to work with the old version of the imported objects.
    if __name__ == '__main__':
        main()
NOTE: if the complaint is about "Tkinter" (upper case T) and you have already configured module "tkinter" (lower case t), the solution is not to rename tkinter to Tkinter or vice versa. There is probably something wrong with your module search path. Check out the value of sys.path.
For X-related modules (Xt and Xm) you will have to do more work: they are currently not part of the standard Python distribution. You will have to ftp the Extensions tar file, i.e. ftp://ftp.python.org/pub/python/src/X-extension.tar.gz and follow the instructions there.
See also the next question.
Currently supported solutions:
Cross-platform:
There's a neat object-oriented interface to the Tcl/Tk widget set, called Tkinter. It is part of the standard Python distribution and well-supported -- all you need to do is build and install Tcl/Tk and enable the _tkinter module and the TKPATH definition in Modules/Setup when building Python. This is probably the easiest to install and use, and the most complete widget set. It is also very likely that in the future the standard Python GUI API will be based on or at least look very much like the Tkinter interface. For more info about Tk, including pointers to the source, see the Tcl/Tk home page at http://www.scriptics.com. Tcl/Tk is now fully portable to the Mac and Windows platforms (NT and 95 only); you need Python 1.4beta3 or later and Tk 4.1patch1 or later.
There's an interface to wxWindows called wxPython. wxWindows is a portable GUI class library written in C++. It supports GTK, Motif, MS-Windows and Mac as targets. Ports to other platforms are being contemplated or have already had some work done on them. wxWindows preserves the look and feel of the underlying graphics toolkit, and there is quite a rich widget set and collection of GDI classes. See the wxWindows page at http://www.wxwindows.org/ for more details. wxPython is a python extension module that wraps many of the wxWindows C++ classes, and is quickly gaining popularity amongst Python developers. You can get wxPython as part of the source or CVS distribution of wxWindows, or directly from its home page at http://alldunn.com/wxPython/.
Bindings to Gnome and the GIMP Toolkit by James Henstridge exist; see ftp://ftp.daa.com.au/pub/james/python/.
For KDE bindings, see ftp://ftp.kde.org/pub/kde/devel/kde-bindings or http://www.river-bank.demon.co.uk/software/.
For OpenGL bindings, see http://starship.python.net/~da/PyOpenGL.
Platform specific:
The Mac port has a rich and ever-growing set of modules that support the native Mac toolbox calls. See the documentation that comes with the Mac port. See ftp://ftp.python.org/pub/python/mac. Support by Jack Jansen [email protected].
Pythonwin by Mark Hammond ([email protected]) includes an interface to the Microsoft Foundation Classes and a Python programming environment using it that's written mostly in Python. See http://www.python.org/windows/.
There's an object-oriented GUI based on the Microsoft Foundation Classes model called WPY, supported by Jim Ahlstrom [email protected]. Programs written in WPY run unchanged and with native look and feel on Windows NT/95, Windows 3.1 (using win32s), and on Unix (using Tk). Source and binaries for Windows and Linux are available in ftp://ftp.python.org/pub/python/wpy/.
Obsolete or minority solutions:
There's an interface to X11, including the Athena and Motif widget sets (and a few individual widgets, like Mosaic's HTML widget and SGI's GL widget) available from ftp://ftp.python.org/pub/python/src/X-extension.tar.gz. Support by Sjoerd Mullender [email protected].
On top of the X11 interface there's the vpApp toolkit by Per Spilling, now also maintained by Sjoerd Mullender [email protected]. See ftp://ftp.cwi.nl/pub/sjoerd/vpApp.tar.gz.
For SGI IRIX only, there are unsupported interfaces to the complete GL (Graphics Library -- low level but very good 3D capabilities) as well as to FORMS (a buttons-and-sliders-etc package built on top of GL by Mark Overmars -- ftp'able from ftp://ftp.cs.ruu.nl/pub/SGI/FORMS/). This is probably also becoming obsolete, as OpenGL takes over (see above).
There's an interface to STDWIN, a platform-independent low-level windowing interface for Mac and X11. This is totally unsupported and rapidly becoming obsolete. The STDWIN sources are at ftp://ftp.cwi.nl/pub/stdwin/.
There is an interface to WAFE, a Tcl interface to the X11 Motif and Athena widget sets. WAFE is at http://www.wu-wien.ac.at/wafe/wafe.html.
    # Primes < 1000
    print filter(None,map(lambda y:y*reduce(lambda x,y:x*y!=0,
          map(lambda x,y=y:y%x,range(2,int(pow(y,0.5)+1))),1),range(2,1000)))

    # First 10 Fibonacci numbers
    print map(lambda x,f=lambda x,f:(x<=1) or (f(x-1,f)+f(x-2,f)):
          f(x,f), range(10))

    # Mandelbrot set
    print (lambda Ru,Ro,Iu,Io,IM,Sx,Sy:reduce(lambda x,y:x+y,map(lambda y,
    Iu=Iu,Io=Io,Ru=Ru,Ro=Ro,Sy=Sy,L=lambda yc,Iu=Iu,Io=Io,Ru=Ru,Ro=Ro,i=IM,
    Sx=Sx,Sy=Sy:reduce(lambda x,y:x+y,map(lambda x,xc=Ru,yc=yc,Ru=Ru,Ro=Ro,
    i=i,Sx=Sx,F=lambda xc,yc,x,y,k,f=lambda xc,yc,x,y,k,f:(k<=0)or (x*x+y*y
    >=4.0) or 1+f(xc,yc,x*x-y*y+xc,2.0*x*y+yc,k-1,f):f(xc,yc,x,y,k,f):chr(
    64+F(Ru+x*(Ro-Ru)/Sx,yc,0,0,i)),range(Sx))):L(Iu+y*(Io-Iu)/Sy),range(Sy
    ))))(-2.1, 0.7, -1.2, 1.2, 30, 80, 24)
    #    \___ ___/  \___ ___/  |   |   |__ lines on screen
    #        V          V      |   |______ columns on screen
    #        |          |      |__________ maximum of "iterations"
    #        |          |_________________ range on y axis
    #        |____________________________ range on x axis

Don't try this at home, kids!
Tim Peters (who wishes it was Steve Majewski) suggested the following solution: (a and [b] or [c])[0]. Because [b] is a singleton list it is never false, so the wrong path is never taken; then applying [0] to the whole thing gets the b or c that you really wanted. Ugly, but it gets you there in the rare cases where it is really inconvenient to rewrite your code using 'if'.
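A minimal sketch of the trick, showing that it picks b even when b itself is false (which a naive "a and b or c" would get wrong):

```python
def pick(a, b, c):
    # [b] and [c] are one-element lists, so both are always true;
    # indexing the result with [0] then recovers the value we wanted.
    return (a and [b] or [c])[0]

assert pick(1, 0, "else") == 0        # naive 'a and 0 or "else"' would wrongly give "else"
assert pick(0, 0, "else") == "else"
```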
The del statement does not necessarily call __del__ -- it simply decrements the object's reference count, and if this reaches zero __del__ is called.
If your data structures contain circular links (e.g. a tree where each child has a parent pointer and each parent has a list of children) the reference counts will never go back to zero. You'll have to define an explicit close() method which removes those pointers. Please don't ever call __del__ directly -- __del__ should call close() and close() should make sure that it can be called more than once for the same object.
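A sketch of such an explicit close() method for a parent/child structure (the class and attribute names are invented for illustration):

```python
class Node:
    def __init__(self, parent=None):
        self.parent = parent        # child -> parent pointer (one half of the cycle)
        self.children = []          # parent -> children pointers (the other half)
        if parent is not None:
            parent.children.append(self)

    def close(self):
        # Break the cycles; written so that calling it twice is harmless.
        for child in self.children:
            child.parent = None
        self.children = []
        self.parent = None

root = Node()
leaf = Node(root)
root.close()
assert leaf.parent is None and root.children == []
```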
If the object has ever been a local variable (or argument, which is really the same thing) to a function that caught an exception in an except clause, chances are that a reference to the object still exists in that function's stack frame as contained in the stack trace. Normally, deleting (better: assigning None to) sys.exc_traceback will take care of this. If a stack was printed for an unhandled exception in an interactive interpreter, delete sys.last_traceback instead.
There is code that deletes all objects when the interpreter exits, but it is not called if your Python has been configured to support threads (because other threads may still be active). You can define your own cleanup function using sys.exitfunc (see question 4.4).
Finally, if your __del__ method raises an exception, a warning message is printed to sys.stderr.
Starting with Python 2.0, the interpreter includes a garbage collector capable of reclaiming the space used by many cycles with no external references. There are, however, pathological cases where it can be expected to fail, so making sure your programs break such cycles is always safest.
Question 6.14 is intended to explain the new garbage collection algorithm.
Before Python 1.4, modifying the environment passed to subshells was left out of the interpreter because there seemed to be no well-established portable way to do it (in particular, some systems have putenv(), others have setenv(), and some have none at all). As of Python 1.4, almost all Unix systems do have putenv(), and so does the Win32 API, and thus the os module was modified so that changes to os.environ are trapped and the corresponding putenv() call is made.
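For example (the variable name MY_FLAG is just an illustration; any name works):

```python
import os

# Assigning into os.environ triggers the corresponding putenv() call,
# so child processes started after this point see the new value.
os.environ["MY_FLAG"] = "1"
assert os.environ["MY_FLAG"] == "1"
```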
Trivia note regarding bound methods: each reference to a bound method of a particular object creates a bound method object. If you have two such references (a = inst.meth; b = inst.meth), they will compare equal (a == b) but are not the same (a is not b).
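For example:

```python
class C:
    def meth(self):
        return "called"

inst = C()
a = inst.meth    # each attribute access creates a fresh bound method object
b = inst.meth

assert a == b        # same __self__ and __func__, so they compare equal
assert a is not b    # but they are two distinct objects
assert a() == "called"
```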
Often when you want to do this you are forgetting that classes are first class in Python. You can "point to" the class you want to delegate an operation to either at the instance or at the subclass level. For example if you want to use a "glorp" operation of a superclass you can point to the right superclass to use.
    class subclass(superclass1, superclass2, superclass3):
        delegate_glorp = superclass2
        ...
        def glorp(self, arg1, arg2):
            ... subclass specific stuff ...
            self.delegate_glorp.glorp(self, arg1, arg2)
        ...

    class subsubclass(subclass):
        delegate_glorp = superclass3
        ...

Note, however, that setting delegate_glorp to subclass in subsubclass would cause an infinite recursion on subclass.delegate_glorp. Careful! Maybe you are getting too fancy for your own good. Consider simplifying the design (?).

    BaseAlias = <real base class>

    class Derived(BaseAlias):
        def meth(self):
            BaseAlias.meth(self)
            ...
For an instance x of a user-defined class, instance attributes are found in the dictionary x.__dict__, and methods and attributes defined by its class are found in x.__class__.__bases__[i].__dict__ (for i in range(len(x.__class__.__bases__))). You'll have to walk the tree of base classes to find all class methods and attributes.
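A sketch of such a walk (the helper name all_attr_names is invented; it collects attribute names from the instance itself, its class, and every base class, recursively):

```python
def all_attr_names(x):
    names = list(x.__dict__)            # instance attributes
    def walk(cls):
        names.extend(cls.__dict__)      # the class's own methods and attributes
        for base in cls.__bases__:
            walk(base)                  # then each base class, recursively
    walk(x.__class__)
    return names

class A:
    def spam(self): pass

class B(A):
    def eggs(self): pass

b = B()
b.val = 1
found = all_attr_names(b)
assert "val" in found and "eggs" in found and "spam" in found
```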
Many, but not all built-in types define a list of their method names in x.__methods__, and if they have data attributes, their names may be found in x.__members__. However this is only a convention.
For more information, read the source of the standard (but undocumented) module newdir.
This works by scanning your source recursively for import statements (both forms) and looking for the modules on the standard Python path as well as in the source directory (for built-in modules). It then "compiles" the modules written in Python to C code (array initializers that can be turned into code objects using the marshal module) and creates a custom-made config file that only contains those built-in modules which are actually used in the program. It then compiles the generated C code and links it with the rest of the Python interpreter to form a self-contained binary which acts exactly like your script.
Hint: the freeze program only works if your script's filename ends in ".py".
If you want to do this under windows, there are two utilities which may be helpful. The first is Gordon McMillan's installer at
http://starship.python.net/crew/gmcm/install.html

and the second is Thomas Heller's py2exe at

http://starship.python.net/crew/theller/py2exe/

This latter tool is still under development, but has already been used to make a standalone .exe file for a Python COM server.
If you are interested in Python web frameworks, a good site to visit is
http://groups.yahoo.com/group/python-web-modules

There's also a web browser written in Python, called Grail -- see http://grail.cnri.reston.va.us/grail/. It isn't clear how much maintenance this code has received in recent years, however.
    import popen2
    fromchild, tochild = popen2.popen2("command")
    tochild.write("input\n")
    tochild.flush()
    output = fromchild.readline()

Warning: in general, it is unwise to do this, because you can easily cause a deadlock where your process is blocked waiting for output from the child, while the child is blocked waiting for input from you. This can be caused because the parent expects the child to output more text than it does, or it can be caused by data being stuck in stdio buffers due to lack of flushing. The Python parent can of course explicitly flush the data it sends to the child before it reads any output, but if the child is a naive C program it can easily have been written to never explicitly flush its output, even if it is interactive, since flushing is normally automatic.
Note on a bug in popen2: unless your program calls wait() or waitpid(), finished child processes are never removed, and eventually calls to popen2 will fail because of a limit on the number of child processes. Calling os.waitpid with the os.WNOHANG option can prevent this; a good place to insert such a call would be before calling popen2 again.
In many cases, all you really need is to run some data through a command and get the result back. Unless the data is infinite in size, the easiest (and often the most efficient!) way to do this is to write it to a temporary file and run the command with that temporary file as input. The standard module tempfile exports a function mktemp() which generates unique temporary file names.
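A sketch of the temporary-file approach (a Python one-liner stands in for the external command here, and the example uses the modern subprocess/mkstemp spellings so it is self-contained and runnable):

```python
import os
import subprocess
import sys
import tempfile

data = "3\n1\n2\n"

# Write the input data to a uniquely named temporary file...
fd, path = tempfile.mkstemp()
try:
    with os.fdopen(fd, "w") as f:
        f.write(data)
    # ...then run the command with that file as its input.
    script = "import sys; print(sum(int(x) for x in open(sys.argv[1])))"
    result = subprocess.run([sys.executable, "-c", script, path],
                            capture_output=True, text=True)
finally:
    os.remove(path)   # clean up the temporary file afterwards

assert result.stdout.strip() == "6"
```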
Note that many interactive programs (e.g. vi) don't work well with pipes substituted for standard input and output. You will have to use pseudo ttys ("ptys") instead of pipes. There is some undocumented code to use these in the library module pty.py -- I'm afraid you're on your own here.
A different answer is a Python interface to Don Libes' "expect" library. A Python extension that interfaces to expect is called "expy" and available from http://expectpy.sourceforge.net/.
A pure Python solution that works like expect is PIPE by John Croix. A prerelease of PIPE is available from ftp://ftp.python.org/pub/python/contrib/System/.
    func(1, 2, 3)

is equivalent to

    args = (1, 2, 3)
    apply(func, args)

Note that func(args) is not the same -- it calls func() with exactly one argument, the tuple args, instead of three arguments, the integers 1, 2 and 3.

In Python 2.0, you can also use extended call syntax:

    f(*args)

is equivalent to

    apply(f, args)
If you are using an older version of XEmacs or Emacs you will need to put this in your .emacs file:
    (defun my-python-mode-hook ()
      (setq font-lock-keywords python-font-lock-keywords)
      (font-lock-mode 1))
    (add-hook 'python-mode-hook 'my-python-mode-hook)
For simple input parsing, the easiest approach is usually to split the line into whitespace-delimited words using string.split(), and to convert decimal strings to numeric values using string.atoi(), string.atol() or string.atof(). (Python's atoi() is 32-bit and its atol() is arbitrary precision.) string.split supports an optional "sep" parameter which is useful if the line uses something other than whitespace as a delimiter.
For more complicated input parsing, regular expressions (see module re) are better suited and more powerful than C's sscanf().
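For instance (written with the string-method and built-in spellings, split()/int()/float(), so it runs as-is on current interpreters; string.split/atoi/atof are the 1.5-era names for the same operations):

```python
import re

line = "alpha 42 3.14"

# Simple case: split on whitespace, then convert each field.
name, count, ratio = line.split()
assert (name, int(count), float(ratio)) == ("alpha", 42, 3.14)

# The same fields extracted with a regular expression,
# roughly playing the role of C's sscanf("%s %d %f", ...).
m = re.match(r"(\S+)\s+(\d+)\s+([\d.]+)", line)
assert m is not None
assert (m.group(1), int(m.group(2)), float(m.group(3))) == ("alpha", 42, 3.14)
```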
There's a contributed module that emulates sscanf(), by Steve Clift; see contrib/Misc/sscanfmodule.c of the ftp site:
http://www.python.org/ftp/python/contrib-09-Dec-1999/Misc/
    from Tkinter import tkinter
    tkinter.createfilehandler(file, mask, callback)

The file may be a Python file or socket object (actually, anything with a fileno() method), or an integer file descriptor. The mask is one of the constants tkinter.READABLE or tkinter.WRITABLE. The callback is called as follows:

    callback(file, mask)

You must unregister the callback when you're done, using

    tkinter.deletefilehandler(file)

Note: since you don't know *how many bytes* are available for reading, you can't use the Python file object's read or readline methods, since these will insist on reading a predefined number of bytes. For sockets, the recv() or recvfrom() methods will work fine; for other files, use os.read(file.fileno(), maxbytecount).
1) By using global variables; but you probably shouldn't :-)
2) By passing a mutable (changeable in-place) object:
    def func1(a):
        a[0] = 'new-value'     # 'a' references a mutable list
        a[1] = a[1] + 1        # changes a shared object

    args = ['old-value', 99]
    func1(args)
    print args[0], args[1]     # output: new-value 100

3) By returning a tuple, holding the final values of arguments:

    def func2(a, b):
        a = 'new-value'        # a and b are local names
        b = b + 1              # assigned to new objects
        return a, b            # return new values

    x, y = 'old-value', 99
    x, y = func2(x, y)
    print x, y                 # output: new-value 100

4) And other ideas that fall out from Python's object model. For instance, it might be clearer to pass in a mutable dictionary:

    def func3(args):
        args['a'] = 'new-value'       # args is a mutable dictionary
        args['b'] = args['b'] + 1     # change it in-place

    args = {'a': 'old-value', 'b': 99}
    func3(args)
    print args['a'], args['b']

5) Or bundle up values in a class instance:

    class callByRef:
        def __init__(self, **args):
            for (key, value) in args.items():
                setattr(self, key, value)

    def func4(args):
        args.a = 'new-value'   # args is a mutable callByRef
        args.b = args.b + 1    # change object in-place

    args = callByRef(a='old-value', b=99)
    func4(args)
    print args.a, args.b

But there's probably no good reason to get this complicated :-).

[Python's author favors solution 3 in most cases.]
Though a bit surprising at first, a moment's consideration explains this. On the one hand, requiring 'global' for assigned variables provides a bar against unintended side-effects. On the other hand, if 'global' were required for all global references, you'd be using 'global' all the time: you'd have to declare as global every reference to a builtin function or to a component of an imported module. This clutter would defeat the usefulness of the 'global' declaration for identifying side-effects.
First: all exports (like globals, functions, and classes that don't need imported base classes).
Then: all import statements.
Finally: all active code (including globals that are initialized from imported values).
Python's author doesn't like this approach much because the imports appear in a strange place, but has to admit that it works. His recommended strategy is to avoid all uses of "from <module> import *" (so everything from an imported module is referenced as <module>.<name>) and to place all code inside functions. Initializations of global variables and class variables should use constants or built-in functions only.
For immutable objects (numbers, strings, tuples), cloning is unnecessary since their value can't change. For lists (and generally for mutable sequence types), a clone is created by the expression l[:]. For dictionaries, the following function returns a clone:
    def dictclone(o):
        n = {}
        for k in o.keys():
            n[k] = o[k]
        return n

Finally, for generic objects, the "copy" module defines two functions for copying objects. copy.copy(x) returns a copy as shown by the above rules. copy.deepcopy(x) also copies the elements of composite objects. See the section on this module in the Library Reference Manual.
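The difference between the two copy functions shows up as soon as the copied object contains a mutable element:

```python
import copy

original = {"nums": [1, 2, 3]}

shallow = copy.copy(original)     # new dict, but it shares the inner list
deep = copy.deepcopy(original)    # new dict and a new inner list

original["nums"].append(4)
assert shallow["nums"] == [1, 2, 3, 4]   # the shallow copy sees the change
assert deep["nums"] == [1, 2, 3]         # the deep copy is fully independent
```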
A more awkward way of doing things is to use pickle's little sister, marshal. The marshal module provides very fast ways to store noncircular basic Python types to files and strings, and back again. Although marshal does not do fancy things like store instances or handle shared references properly, it does run extremely fast. For example loading a half megabyte of data may take less than a third of a second (on some machines). This often beats doing something more complex and general such as using gdbm with pickle/shelve.
To remove a directory, use os.rmdir(); use os.mkdir() to create one.
To rename a file, use os.rename().
To truncate a file, open it using f = open(filename, "r+"), and use f.truncate(offset); offset defaults to the current seek position. (The "r+" mode opens the file for reading and writing.) There's also os.ftruncate(fd, offset) for files opened with os.open() -- for advanced Unix hacks only.
The shutil module also contains a number of functions to work on files including copyfile, copytree, and rmtree amongst others.
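A short sketch exercising these calls on a scratch directory (all the file and directory names here are invented for the example):

```python
import os
import shutil
import tempfile

base = tempfile.mkdtemp()                   # scratch area for the example
os.mkdir(os.path.join(base, "sub"))         # create a directory

src = os.path.join(base, "a.txt")
with open(src, "w") as f:
    f.write("hello")

shutil.copyfile(src, os.path.join(base, "sub", "a.txt"))   # copy a file
os.rename(os.path.join(base, "sub", "a.txt"),
          os.path.join(base, "sub", "b.txt"))              # rename it
assert os.path.exists(os.path.join(base, "sub", "b.txt"))

shutil.rmtree(base)                         # remove the whole tree again
assert not os.path.exists(base)
```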
Note that these are restricted to decimal interpretation, so that int('0144') == 144 and int('0x144') raises ValueError. For Python 2.0 int takes the base to convert from as a second optional argument, so int('0x144', 16) == 324.
For greater flexibility, or before Python 1.5, import the module string and use the string.atoi() function for integers, string.atol() for long integers, or string.atof() for floating-point. E.g., string.atoi('100', 16) == string.atoi('0x100', 0) == 256. See the library reference manual section for the string module for more details.
While you could use the built-in function eval() instead of any of those, this is not recommended, because someone could pass you a Python expression that might have unwanted side effects (like reformatting your disk). It also has the effect of interpreting numbers as Python expressions, so that e.g. eval('09') gives a syntax error since Python regards numbers starting with '0' as octal (base 8).
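A few concrete conversions using the two-argument form of int() described above (note that on current interpreters a leading zero in the string is simply ignored rather than treated as octal):

```python
# Decimal interpretation by default.
assert int('0144') == 144

# Non-decimal strings need an explicit base.
assert int('0x144', 16) == 324

# Without the base argument, the same string is rejected.
try:
    int('0x144')
except ValueError:
    pass
else:
    raise AssertionError("expected ValueError")
```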
However, there are some legitimate situations where you need to test for class membership.
In Python 1.5, you can use the built-in function isinstance(obj, cls).
The following approaches can be used with earlier Python versions:
An unobvious method is to raise the object as an exception and to try to catch the exception with the class you're testing for:
    def is_instance_of(the_instance, the_class):
        try:
            raise the_instance
        except the_class:
            return 1
        except:
            return 0

This technique can be used to distinguish "subclassness" from a collection of classes as well:

    try:
        raise the_instance
    except Audible:
        the_instance.play(largo)
    except Visual:
        the_instance.display(gaudy)
    except Olfactory:
        sniff(the_instance)
    except:
        raise ValueError, "dunno what to do with this!"

This uses the fact that exception catching tests for class or subclass membership.
A different approach is to test for the presence of a class attribute that is presumably unique for the given class. For instance:
    class MyClass:
        ThisIsMyClass = 1
        ...

    def is_a_MyClass(the_instance):
        return hasattr(the_instance, 'ThisIsMyClass')

This version is easier to inline, and probably faster (inlined it is definitely faster). The disadvantage is that someone else could cheat:

    class IntruderClass:
        ThisIsMyClass = 1    # Masquerade as MyClass
        ...

but this may be seen as a feature (anyway, there are plenty of other ways to cheat in Python). Another disadvantage is that the class must be prepared for the membership test. If you do not "control the source code" for the class it may not be advisable to modify the class to support testability.
    from string import upper

    class UpperOut:
        def __init__(self, outfile):
            self.__outfile = outfile
        def write(self, str):
            self.__outfile.write( upper(str) )
        def __getattr__(self, name):
            return getattr(self.__outfile, name)

Here the UpperOut class redefines the write method to convert the argument string to upper case before calling the underlying self.__outfile.write method, but all other methods are delegated to the underlying self.__outfile object. The delegation is accomplished via the "magic" __getattr__ method. Please see the language reference for more information on the use of this method.
Note that for more general cases delegation can get trickier. In particular, when attributes must be set as well as gotten, the class must define a __setattr__ method too, and it must do so carefully.
The basic implementation of __setattr__ is roughly equivalent to the following:
    class X:
        ...
        def __setattr__(self, name, value):
            self.__dict__[name] = value
        ...

Most __setattr__ implementations must modify self.__dict__ to store local state for self without causing an infinite recursion.
http://pyunit.sourceforge.net/

For standalone testing, it helps to write the program so that it may be easily tested by using good modular design. In particular your program should have almost all functionality encapsulated in either functions or class methods -- and this sometimes has the surprising and delightful effect of making the program run faster (because local variable accesses are faster than global accesses). Furthermore the program should avoid depending on mutating global variables, since this makes testing much more difficult to do.
The "global main logic" of your program may be as simple as
    if __name__ == "__main__":
        main_logic()

at the bottom of the main module of your program.
Once your program is organized as a tractable collection of functions and class behaviours, you should write test functions that exercise those behaviours. A test suite which automates a sequence of tests can be associated with each module. This sounds like a lot of work, but since Python is so terse and flexible it's surprisingly easy. You can make coding much more pleasant and fun by writing your test functions in parallel with the "production code", since this makes it easy to find bugs and even design flaws earlier.
"Support modules" that are not intended to be the main module of a program may include a "test script interpretation" which invokes a self test of the module.
    if __name__ == "__main__":
        self_test()

Even programs that interact with complex external interfaces may be tested when the external interfaces are unavailable by using "fake" interfaces implemented in Python. For an example of a "fake" interface, the following class defines (part of) a "fake" file interface:

    import string
    testdata = "just a random sequence of characters"

    class FakeInputFile:
        data = testdata
        position = 0
        closed = 0

        def read(self, n=None):
            self.testclosed()
            p = self.position
            if n is None:
                result = self.data[p:]
            else:
                result = self.data[p: p+n]
            self.position = p + len(result)
            return result

        def seek(self, n, m=0):
            self.testclosed()
            last = len(self.data)
            p = self.position
            if m == 0:
                final = n
            elif m == 1:
                final = n + p
            elif m == 2:
                final = last + n
            else:
                raise ValueError, "bad m"
            if final < 0:
                raise IOError, "negative seek"
            self.position = final

        def isatty(self):
            return 0

        def tell(self):
            return self.position

        def close(self):
            self.closed = 1

        def testclosed(self):
            if self.closed:
                raise IOError, "file closed"

Try f = FakeInputFile() and test out its operations.
A = [[None] * 2] * 3This makes a list containing 3 references to the same list of length two. Changes to one row will show in all rows, which is probably not what you want. The following works much better:
A = [None]*3 for i in range(3): A[i] = [None] * 2This generates a list containing 3 different lists of length two.
If you feel weird, you can also do it in the following way:
w, h = 2, 3
A = map(lambda i, w=w: [None] * w, range(h))

For Python 2.0 the above can be spelled using a list comprehension:
w, h = 2, 3
A = [ [None]*w for i in range(h) ]
def st(List, Metric):
    def pairing(element, M=Metric):
        return (M(element), element)
    paired = map(pairing, List)
    paired.sort()
    return map(stripit, paired)

def stripit(pair):
    return pair[1]

This technique, attributed to Randal Schwartz, sorts the elements of a list by a metric which maps each element to its "sort value". For example, if L is a list of strings then
import string
Usorted = st(L, string.upper)
def intfield(s):
    return string.atoi( string.strip(s[10:15]) )

Isorted = st(L, intfield)

Usorted gives the elements of L sorted as if they were upper case, and Isorted gives the elements of L sorted by the integer values that appear in the string slices starting at position 10 and ending at position 15. In Python 2.0 this can be done more naturally with list comprehensions:
import string
tmp1 = [ (string.upper(x), x) for x in L ]   # Schwartzian transform
tmp1.sort()
Usorted = [ x[1] for x in tmp1 ]
tmp2 = [ (int(s[10:15]), s) for s in L ]     # Schwartzian transform
tmp2.sort()
Isorted = [ x[1] for x in tmp2 ]
Note that Isorted may also be computed by
def Icmp(s1, s2):
    return cmp( intfield(s1), intfield(s2) )
Isorted = L[:]
Isorted.sort(Icmp)

but since this method computes intfield many times for each element of L, it is slower than the Schwartzian Transform.
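Here is a compact, self-contained run of the decorate-sort-undecorate idea, using a hypothetical sample list and 2.0-style list comprehensions:

```python
# Decorate-sort-undecorate ("Schwartzian transform") on a sample list.
L = ["banana", "Apple", "cherry"]           # hypothetical sample data
tmp = [(x.upper(), x) for x in L]           # decorate with the sort key
tmp.sort()                                  # sort on the key alone
Usorted = [pair[1] for pair in tmp]         # undecorate
assert Usorted == ["Apple", "banana", "cherry"]
```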
The function list(seq) converts any sequence into a list with the same items in the same order. For example, list((1, 2, 3)) yields [1, 2, 3] and list('abc') yields ['a', 'b', 'c']. If the argument is a list, it makes a copy just like seq[:] would.
You can program the class's constructor to keep track of all instances, but unless you're very clever, this has the disadvantage that the instances never get deleted, because your list of all instances keeps a reference to them.
(The trick is to regularly inspect the reference counts of the instances you've retained, and if the reference count is below a certain level, remove it from the list. Determining that level is tricky -- it's definitely larger than 1.)
regex.match('.*x', "x"*5000)

will fail.
This is fixed in the re module introduced with Python 1.5; consult the Library Reference section on re for more information.
handler(signum, frame)

so it should be declared with two arguments:
def handler(signum, frame): ...
x = 1 # make a global
def f():
    print x   # try to print the global
    ...
    for j in range(100):
        if q > 3:
            x = 4

Any variable assigned in a function is local to that function, unless it is specifically declared global. Since a value is bound to x as the last statement of the function body, the compiler assumes that x is local. Consequently the "print x" attempts to print an uninitialized local variable and will trigger a NameError.
In such cases the solution is to insert an explicit global declaration at the start of the function, making it
def f():
    global x
    print x   # try to print the global
    ...
    for j in range(100):
        if q > 3:
            x = 4
In this case, all references to x are interpreted as references to the x from the module namespace.
Using negative indices can be very convenient. For example if the string Line ends in a newline then Line[:-1] is all of Line except the newline.
Sadly the list builtin method L.insert does not observe negative indices. This feature could be considered a mistake but since existing programs depend on this feature it may stay around forever. L.insert for negative indices inserts at the start of the list. To get "proper" negative index behaviour use L[n:n] = [x] in place of the insert method.
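A short sketch of the slice-assignment workaround:

```python
# Insert at a negative index via slice assignment rather than L.insert.
L = [1, 2, 3]
n = -1
L[n:n] = ['x']        # insert just before the last element
assert L == [1, 2, 'x', 3]
```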
>>> list1 = ["what", "I'm", "sorting", "by"]
>>> list2 = ["something", "else", "to", "sort"]
>>> pairs = map(None, list1, list2)
>>> pairs
[('what', 'something'), ("I'm", 'else'), ('sorting', 'to'), ('by', 'sort')]
>>> pairs.sort()
>>> pairs
[("I'm", 'else'), ('by', 'sort'), ('sorting', 'to'), ('what', 'something')]
>>> result = pairs[:]
>>> for i in xrange(len(result)): result[i] = result[i][1]
...
>>> result
['else', 'sort', 'to', 'something']

And if you didn't understand the question, please see the example above ;c). Note that "I'm" sorts before "by" because uppercase "I" comes before lowercase "b" in ASCII order. Also see 4.51.
In Python 2.0 this can be done like:
>>> list1 = ["what", "I'm", "sorting", "by"]
>>> list2 = ["something", "else", "to", "sort"]
>>> pairs = zip(list1, list2)
>>> pairs
[('what', 'something'), ("I'm", 'else'), ('sorting', 'to'), ('by', 'sort')]
>>> pairs.sort()
>>> result = [ x[1] for x in pairs ]
>>> result
['else', 'sort', 'to', 'something']

[Followup]
Someone asked, why not this for the last steps:
result = []
for p in pairs:
    result.append(p[1])

This is much more legible. However, a quick test shows that it is almost twice as slow for long lists. Why? First of all, the append() operation has to reallocate memory, and while it uses some tricks to avoid doing that each time, it still has to do it occasionally, and apparently that costs quite a bit. Second, the expression "result.append" requires an extra attribute lookup. The attribute lookup could be done away with by rewriting as follows:
result = []
append = result.append
for p in pairs:
    append(p[1])

which gains back some speed, but is still considerably slower than the original solution, and hardly less convoluted.
Using 1.4, you can find out which methods a given object supports by looking at its __methods__ attribute:
>>> List = []
>>> List.__methods__
['append', 'count', 'index', 'insert', 'remove', 'reverse', 'sort']
Yes. Here's a simple example that uses httplib.
#!/usr/local/bin/python
import httplib, sys, time
### build the query string qs = "First=Josephine&MI=Q&Last=Public"
### connect and send the server a path
httpobj = httplib.HTTP('www.some-server.out-there', 80)
httpobj.putrequest('POST', '/cgi-bin/some-cgi-script')
### now generate the rest of the HTTP headers...
httpobj.putheader('Accept', '*/*')
httpobj.putheader('Connection', 'Keep-Alive')
httpobj.putheader('Content-type', 'application/x-www-form-urlencoded')
httpobj.putheader('Content-length', '%d' % len(qs))
httpobj.endheaders()
httpobj.send(qs)
### find out what the server said in response...
reply, msg, hdrs = httpobj.getreply()
if reply != 200:
    sys.stdout.write(httpobj.getfile().read())

Note that in general for "url encoded posts" (the default) query strings must be "quoted" to, for example, change equals signs and spaces to an encoded form when they occur in name or value. Use urllib.quote to perform this quoting. For example, to send name="Guy Steele, Jr.":
>>> from urllib import quote
>>> x = quote("Guy Steele, Jr.")
>>> x
'Guy%20Steele,%20Jr.'
>>> query_string = "name=" + x
>>> query_string
'name=Guy%20Steele,%20Jr.'
If you have initialized a new bsddb database but not written anything to it before the program crashes, you will often wind up with a zero-length file and encounter an exception the next time the file is opened.
The first is done by executing 'chmod +x scriptfile' or perhaps 'chmod 755 scriptfile'.
The second can be done in a number of ways. The most straightforward way is to write
#!/usr/local/bin/python

as the very first line of your file - or whatever the pathname is where the python interpreter is installed on your platform.
If you would like the script to be independent of where the python interpreter lives, you can use the "env" program. On almost all platforms, the following will work, assuming the python interpreter is in a directory on the user's $PATH:
#! /usr/bin/env python

Note -- *don't* do this for CGI scripts. The $PATH variable for CGI scripts is often very minimal, so you need to use the actual absolute pathname of the interpreter.
Occasionally, a user's environment is so full that the /usr/bin/env program fails; or there's no env program at all. In that case, you can try the following hack (due to Alex Rezinsky):
#! /bin/sh
""":"
exec python $0 ${1+"$@"}
"""

The disadvantage is that this defines the script's __doc__ string. However, you can fix that by adding
__doc__ = """...Whatever..."""
if List:
    List.sort()
    last = List[-1]
    for i in range(len(List)-2, -1, -1):
        if last == List[i]:
            del List[i]
        else:
            last = List[i]

If all elements of the list may be used as dictionary keys (ie, they are all hashable) this is often faster:
d = {}
for x in List:
    d[x] = x
List = d.values()

Also, for extremely large lists you might consider more optimal alternatives to the first one. The second one is pretty good whenever it can be used.
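A small self-contained run of the dictionary trick (note that it does not preserve the original order, so the sketch sorts the result before comparing):

```python
# The dictionary trick: duplicates collapse onto the same key.
items = [3, 1, 3, 2, 1]
d = {}
for x in items:
    d[x] = x
unique = list(d.values())
unique.sort()
assert unique == [1, 2, 3]
```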
Given the nature of freely available software, I have to add that this statement is not legally binding. The Python copyright notice contains the following disclaimer:
STICHTING MATHEMATISCH CENTRUM AND CNRI DISCLAIM ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL STICHTING MATHEMATISCH CENTRUM OR CNRI BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.

The good news is that if you encounter a problem, you have full source available to track it down and fix it!
def method_map(objects, method, arguments):
    """method_map([a,b], "flog", (1,2)) gives [a.flog(1,2), b.flog(1,2)]"""
    nobjects = len(objects)
    methods = map(getattr, objects, [method]*nobjects)
    return map(apply, methods, [arguments]*nobjects)

It's generally a good idea to get to know the mysteries of map and apply and getattr and the other dynamic features of Python.
import whrandom
whrandom.random()

This returns a random floating point number in the range [0, 1).
There are also other specialized generators in this module:
randint(a, b)   chooses an integer in the range [a, b)
choice(S)       chooses from a given sequence
uniform(a, b)   chooses a floating point number in the range [a, b)

To force the random number generator's initial setting, use
seed(x, y, z)   set the seed from three integers in [1, 256)

There's also a class, whrandom, which you can instantiate to create multiple independent random number generators.
The module "random" contains functions that approximate various standard distributions.
All this is documented in the library reference manual. Note that the module "rand" is obsolete.
ftp://ftp.python.org/pub/python/contrib/sio-151.zip
http://www.python.org/ftp/python/contrib/sio-151.zip

For DOS, try Hans Nowak's Python-DX, which supports this, at:
http://www.cuci.nl/~hnowak/

For Unix, search Deja News (using http://www.python.org/search/) for "serial port" with author Mitch Chapman (his post is a little too long to include here).
Quoting Fredrik Lundh from the mailing list:
Well, the Tk button widget keeps a reference to the internal photoimage object, but Tkinter does not. So when the last Python reference goes away, Tkinter tells Tk to release the photoimage. But since the image is in use by a widget, Tk doesn't destroy it. Not completely. It just blanks the image, making it completely transparent...
And yes, there was a bug in the keyword argument handling in 1.4 that kept an extra reference around in some cases. And when Guido fixed that bug in 1.5, he broke quite a few Tkinter programs...
Fredrik Lundh ([email protected]) explains (on the python-list):
There are (at least) three kinds of modules in Python: 1) modules written in Python (.py); 2) modules written in C and dynamically loaded (.dll, .pyd, .so, .sl, etc); 3) modules written in C and linked with the interpreter; to get a list of these, type:
import sys
print sys.builtin_module_names
SENDMAIL = "/usr/sbin/sendmail" # sendmail location
import os
p = os.popen("%s -t" % SENDMAIL, "w")
p.write("To: [email protected]\n")
p.write("Subject: test\n")
p.write("\n") # blank line separating headers from body
p.write("Some text\n")
p.write("some more text\n")
sts = p.close()
if sts != 0:
print "Sendmail exit status", sts
On non-Unix systems (and on Unix systems too, of course!),
you can use SMTP to send mail to a nearby
mail server. An SMTP library, smtplib.py, has been included
since Python 1.5.1. Here's a very simple interactive mail
sender that uses it. This method will work on any host that
supports an SMTP listener;
otherwise, you will have to ask the user for a host:
import sys, string, smtplib
fromaddr = raw_input("From: ")
toaddrs = string.splitfields(raw_input("To: "), ',')
print "Enter message, end with ^D:"
msg = ''
while 1:
    line = sys.stdin.readline()
    if not line:
        break
    msg = msg + line
# The actual mail send
server = smtplib.SMTP('localhost')
server.sendmail(fromaddr, toaddrs, msg)
server.quit()
To prevent the TCP connect from blocking, you can set the socket to non-blocking mode. Then when you do the connect(), you will either connect immediately (unlikely) or get an exception that contains the errno. errno.EINPROGRESS indicates that the connection is in progress, but hasn't finished yet. Different OSes will return different errnos, so you're going to have to check. I can tell you that different versions of Solaris return different errno values.
In Python 1.5 and later, you can use connect_ex() to avoid creating an exception. It will just return the errno value.
To poll, you can call connect_ex() again later -- 0 or errno.EISCONN indicate that you're connected -- or you can pass this socket to select (checking to see if it is writeable).
>>> a = 010

To verify that this works, you can type "a" and hit enter while in the interpreter, which will cause Python to spit out the current value of "a" in decimal:
>>> a
8

Hexadecimal is just as easy. Simply precede the hexadecimal number with a zero, and then a lower or uppercase "x". Hexadecimal digits can be specified in lower or uppercase. For example, in the Python interpreter:
>>> a = 0xa5
>>> a
165
>>> b = 0XB2
>>> b
178
There are several solutions; some involve using curses, which is a pretty big thing to learn. Here's a solution without curses, due to Andrew Kuchling (adapted from code to do a PGP-style randomness pool):
import termios, TERMIOS, sys, os

fd = sys.stdin.fileno()
old = termios.tcgetattr(fd)
new = termios.tcgetattr(fd)
new[3] = new[3] & ~TERMIOS.ICANON & ~TERMIOS.ECHO
new[6][TERMIOS.VMIN] = 1
new[6][TERMIOS.VTIME] = 0
termios.tcsetattr(fd, TERMIOS.TCSANOW, new)
s = ''    # We'll save the characters typed and add them to the pool.
try:
    while 1:
        c = os.read(fd, 1)
        print "Got character", `c`
        s = s + c
finally:
    termios.tcsetattr(fd, TERMIOS.TCSAFLUSH, old)

You need the termios module for any of this to work, and I've only tried it on Linux, though it should work elsewhere. It turns off stdin's echoing and disables canonical mode, then reads one character at a time from stdin.
Where in C++ you'd write
class C {
    C() { cout << "No arguments\n"; }
    C(int i) { cout << "Argument is " << i << "\n"; }
}

in Python you have to write a single constructor that catches all cases using default arguments. For example:
class C:
    def __init__(self, i=None):
        if i is None:
            print "No arguments"
        else:
            print "Argument is", i

This is not entirely equivalent, but close enough in practice.
You could also try a variable-length argument list, e.g.
def __init__(self, *args):
    ...

The same approach works for all method definitions.
class Account:
    def __init__(self, **kw):
        self.accountType = kw.get('accountType')
        self.balance = kw.get('balance')
class CheckingAccount(Account):
    def __init__(self, **kw):
        kw['accountType'] = 'checking'
        apply(Account.__init__, (self,), kw)
myAccount = CheckingAccount(balance=100.00)

In Python 2.0 you can call it directly using the new ** syntax:
class CheckingAccount(Account):
    def __init__(self, **kw):
        kw['accountType'] = 'checking'
        Account.__init__(self, **kw)

or more generally:
>>> def f(x, *y, **z):
...     print x, y, z
...
>>> Y = [1,2,3]
>>> Z = {'foo':3, 'bar':None}
>>> f('hello', *Y, **Z)
hello (1, 2, 3) {'foo': 3, 'bar': None}
It can be found in the FTP contrib area on python.org or on the Starship. Use the search engines there to locate the latest version.
It might also be useful to consider DocumentTemplate, which offers clear separation between Python code and HTML code. DocumentTemplate is part of the Bobo object publishing system (http://www.digicool.com/releases) but can of course be used independently!
http://starship.skyport.net/crew/danilo/
It can create HTML from the doc strings in your Python source code.
For example, the following code reads two 2-byte integers and one 4-byte integer in big-endian format from a file:
import struct
f = open(filename, "rb")   # Open in binary mode for portability
s = f.read(8)
x, y, z = struct.unpack(">hhl", s)

The '>' in the format string forces big-endian data; the letter 'h' reads one "short integer" (2 bytes), and 'l' reads one "long integer" (4 bytes) from the string.
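The same ">hhl" layout can be exercised without a file by packing and unpacking a byte string in memory; a minimal sketch:

```python
import struct

# Round-trip the ">hhl" layout through a byte string instead of a file.
data = struct.pack(">hhl", 1, 2, 3)
assert len(data) == 8                 # 2 + 2 + 4 bytes, big-endian
x, y, z = struct.unpack(">hhl", data)
assert (x, y, z) == (1, 2, 3)
```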
For data that is more regular (e.g. a homogeneous list of ints or floats), you can also use the array module, also documented in the library reference.
The most common cause is that the widget to which the binding applies doesn't have "keyboard focus". Check out the Tk documentation for the focus command. Usually a widget is given the keyboard focus by clicking in it (but not for labels; see the takefocus option).
Starting with Python 1.5, the crypt module is disabled by default. In order to enable it, you must go into the Python source tree and edit the file Modules/Setup to enable it (remove a '#' sign in front of the line starting with '#crypt'). Then rebuild. You may also have to add the string '-lcrypt' to that same line.
When freezing Tkinter applications, the applications will not be truly stand-alone, as the application will still need the tcl and tk libraries.
One solution is to ship the application with the tcl and tk libraries, and point to them at run-time using the TCL_LIBRARY and TK_LIBRARY environment variables.
To get truly stand-alone applications, the Tcl scripts that form the library have to be integrated into the application as well. One tool supporting that is SAM (stand-alone modules), which is part of the Tix distribution (http://tix.mne.com). Build Tix with SAM enabled, perform the appropriate call to Tclsam_init etc inside Python's Modules/tkappinit.c, and link with libtclsam and libtksam (you might include the Tix libraries as well).
Static data (in the sense of C++ or Java) is easy; static methods (again in the sense of C++ or Java) are not supported directly.
STATIC DATA
For example,
class C:
    count = 0   # number of times C.__init__ called

    def __init__(self):
        C.count = C.count + 1

    def getcount(self):
        return C.count   # or return self.count

c.count also refers to C.count for any c such that isinstance(c, C) holds, unless overridden by c itself or by some class on the base-class search path from c.__class__ back to C.
Caution: within a method of C,
self.count = 42

creates a new and unrelated instance variable named "count" in self's own dict. So rebinding of a class-static data name needs the
C.count = 314

form, whether inside a method or not.
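The sharing and the rebinding pitfall can both be checked in a few lines:

```python
# Class-static data shared across instances, and the rebinding pitfall.
class C:
    count = 0
    def __init__(self):
        C.count = C.count + 1

a = C()
b = C()
assert C.count == 2
assert a.count == 2      # instances see the class attribute

a.count = 42             # creates an *instance* attribute on a only...
assert C.count == 2      # ...the class attribute is untouched
assert b.count == 2
```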
STATIC METHODS
Static methods (as opposed to static data) are unnatural in Python, because
C.getcount

returns an unbound method object, which can't be invoked without supplying an instance of C as the first argument.
The intended way to get the effect of a static method is via a module-level function:
def getcount():
    return C.count

If your code is structured so as to define one class (or tightly related class hierarchy) per module, this supplies the desired encapsulation.
Several tortured schemes for faking static methods can be found by searching DejaNews. Most people feel such cures are worse than the disease. Perhaps the least obnoxious is due to Pekka Pessi (mailto:[email protected]):
# helper class to disguise function objects
class _static:
    def __init__(self, f):
        self.__call__ = f

class C:
    count = 0

    def __init__(self):
        C.count = C.count + 1

    def getcount():
        return C.count
    getcount = _static(getcount)

    def sum(x, y):
        return x + y
    sum = _static(sum)

C(); C()
c = C()
print C.getcount()   # prints 3
print c.getcount()   # prints 3
print C.sum(27, 15)  # prints 42
__import__('x.y.z').y.z

For more realistic situations, you may have to do something like
m = __import__(s)
for i in string.split(s, ".")[1:]:
    m = getattr(m, i)
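A runnable sketch of the same loop, using os.path as a stand-in for the hypothetical dotted name "x.y.z" (and string-method split for brevity):

```python
# Import a dotted module name given as a string.
s = "os.path"
m = __import__(s)                # returns the top-level package (os)
for part in s.split(".")[1:]:
    m = getattr(m, part)         # descend to the submodule
assert hasattr(m, "join")        # m is now os.path
```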
import thread

def run(name, n):
    for i in range(n):
        print name, i

for i in range(10):
    thread.start_new(run, (i, 100))

none of the threads seem to run! The reason is that as soon as the main thread exits, all threads are killed.
A simple fix is to add a sleep to the end of the program, sufficiently long for all threads to finish:
import thread, time

def run(name, n):
    for i in range(n):
        print name, i

for i in range(10):
    thread.start_new(run, (i, 100))
time.sleep(10)   # <----------------------------!

But now (on many platforms) the threads don't run in parallel, but appear to run sequentially, one at a time! The reason is that the OS thread scheduler doesn't start a new thread until the previous thread is blocked.
A simple fix is to add a tiny sleep to the start of the run function:
import thread, time

def run(name, n):
    time.sleep(0.001)   # <---------------------!
    for i in range(n):
        print name, i

for i in range(10):
    thread.start_new(run, (i, 100))
time.sleep(10)

Some more hints:
Instead of using a time.sleep() call at the end, it's better to use some kind of semaphore mechanism. One idea is to use the Queue module to create a queue object, let each thread append a token to the queue when it finishes, and let the main thread read as many tokens from the queue as there are threads.
Use the threading module instead of the thread module. It's part of Python since version 1.5.1. It takes care of all these details, and has many other nice features too!
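A sketch of the threading-module alternative: Thread objects with join() replace both sleep() workarounds, because join() blocks the main thread until each worker has finished.

```python
import threading

results = []

def run(name, n):
    for i in range(n):
        results.append((name, i))   # list.append is atomic under the GIL

threads = [threading.Thread(target=run, args=(k, 3)) for k in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()                        # wait for each worker to finish
assert len(results) == 12
```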
For most file objects f you create in Python via the builtin "open" function, f.close() marks the Python file object as being closed from Python's point of view, and also arranges to close the underlying C stream. This happens automatically too, in f's destructor, when f becomes garbage.
But stdin, stdout and stderr are treated specially by Python, because of the special status also given to them by C: doing
sys.stdout.close()   # ditto for stdin and stderr

marks the Python-level file object as being closed, but does not close the associated C stream (provided sys.stdout is still bound to its default value, which is the stream C also calls "stdout").
To close the underlying C stream for one of these three, you should first be sure that's what you really want to do (e.g., you may confuse the heck out of extension modules trying to do I/O). If it is, use os.close:
os.close(0)   # close C's stdin stream
os.close(1)   # close C's stdout stream
os.close(2)   # close C's stderr stream
A global interpreter lock is used internally to ensure that only one thread runs in the Python VM at a time. In general, Python offers to switch among threads only between bytecode instructions (how frequently it offers to switch can be set via sys.setcheckinterval). Each bytecode instruction -- and all the C implementation code reached from it -- is therefore atomic.
In theory, this means an exact accounting requires an exact understanding of the PVM bytecode implementation. In practice, it means that operations on shared variables of builtin data types (ints, lists, dicts, etc) that "look atomic" really are.
For example, these are atomic (L, L1, L2 are lists, D, D1, D2 are dicts, x, y are objects, i, j are ints):
L.append(x)
L1.extend(L2)
x = L[i]
x = L.pop()
L1[i:j] = L2
L.sort()
x = y
x.field = y
D[x] = y
D1.update(D2)
D.keys()

These aren't:
i = i+1
L.append(L[-1])
L[i] = L[j]
D[x] = D[x] + 1

Note: operations that replace other objects may invoke those other objects' __del__ method when their reference count reaches zero, and that can affect things. This is especially true for the mass updates to dictionaries and lists. When in doubt, use a mutex!
>>> s = "Hello, world"
>>> a = list(s)
>>> print a
['H', 'e', 'l', 'l', 'o', ',', ' ', 'w', 'o', 'r', 'l', 'd']
>>> a[7:] = list("there!")
>>> import string
>>> print string.join(a, '')
'Hello, there!'
>>> import array
>>> a = array.array('c', s)
>>> print a
array('c', 'Hello, world')
>>> a[0] = 'y'; print a
array('c', 'yello, world')
>>> a.tostring()
'yello, world'
to another? A: Use 'apply', like:
def f1(a, *b, **c): ...
def f2(x, *y, **z):
    ...
    z['width'] = '14.3c'
    ...
    apply(f1, (x,) + y, z)
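The same forwarding can be written with the 2.0 * / ** call syntax; a runnable sketch (the names f1/f2 and the 'width' key just mirror the example above):

```python
def f1(a, *b, **c):
    return (a, b, c)

def f2(x, *y, **z):
    z['width'] = '14.3c'         # add a keyword before forwarding
    return f1(x, *y, **z)        # equivalent to apply(f1, (x,)+y, z)

assert f2(1, 2, 3, color='red') == (1, (2, 3), {'width': '14.3c', 'color': 'red'})
```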
This can be frustrating if you want to save a printable version to a file, make some changes and then compare it with some other printed dictionary. If you have such needs you can subclass UserDict.UserDict to create a SortedDict class that prints itself in a predictable order. Here's one simpleminded implementation of such a class:
import UserDict, string
class SortedDict(UserDict.UserDict):
    def __repr__(self):
        result = []
        append = result.append
        keys = self.data.keys()
        keys.sort()
        for k in keys:
            append("%s: %s" % (`k`, `self.data[k]`))
        return "{%s}" % string.join(result, ", ")

    __str__ = __repr__
This will work for many common situations you might encounter, though it's far from a perfect solution. (It won't have any effect on the pprint module and does not transparently handle values that are or contain dictionaries.)
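A minimal stand-in for the same idea using a dict subclass (this requires a Python where builtin types can be subclassed, unlike the UserDict version above):

```python
# Dict whose repr lists keys in sorted order.
class SortedDict(dict):
    def __repr__(self):
        keys = list(self.keys())
        keys.sort()
        items = ["%r: %r" % (k, self[k]) for k in keys]
        return "{%s}" % ", ".join(items)
    __str__ = __repr__

d = SortedDict()
d['b'] = 2
d['a'] = 1
assert repr(d) == "{'a': 1, 'b': 2}"   # predictable order
```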
One is to use the freeze tool, which is included in the Python source tree as Tools/freeze. It converts Python byte code to C arrays. Using a C compiler, you can embed all your modules into a new program, which is then linked with the standard Python modules.
On Windows, another alternative exists which does not require a C compiler. Christian Tismer's SQFREEZE (http://starship.python.net/crew/pirx/) appends the byte code to a specially-prepared Python interpreter, which will find the byte code in the executable.
Gordon McMillan's Installer (http://starship.python.net/crew/gmcm/distribute.html) is a third alternative, which works similarly to SQFREEZE but also allows arbitrary additional files to be included in the stand-alone binary.
import termios, TERMIOS, fcntl, FCNTL, sys, os

fd = sys.stdin.fileno()
oldterm = termios.tcgetattr(fd)
newattr = termios.tcgetattr(fd)
newattr[3] = newattr[3] & ~TERMIOS.ICANON & ~TERMIOS.ECHO
termios.tcsetattr(fd, TERMIOS.TCSANOW, newattr)
oldflags = fcntl.fcntl(fd, FCNTL.F_GETFL)
fcntl.fcntl(fd, FCNTL.F_SETFL, oldflags | FCNTL.O_NONBLOCK)
try:
    while 1:
        try:
            c = sys.stdin.read(1)
            print "Got character", `c`
        except IOError:
            pass   # Ignore IOError from empty buffer
finally:
    termios.tcsetattr(fd, TERMIOS.TCSAFLUSH, oldterm)
    fcntl.fcntl(fd, FCNTL.F_SETFL, oldflags)

You need the termios and the fcntl module for any of this to work, and I've only tried it on Linux, though it should work elsewhere.
In this code, characters are read and printed one at a time.
termios.tcsetattr() turns off stdin's echoing and disables canonical mode. fcntl.fcntl() is used to obtain stdin's file descriptor flags and modify them for non-blocking mode. Since reading stdin when it is empty results in an IOError, this error is caught and ignored.
There's more information on this in each of the Python books: Programming Python, Internet Programming with Python, and Das Python-Buch (in German).
There is also a high-level API to Python objects which is provided by the so-called 'abstract' interface -- read Include/abstract.h for further details. It allows for example interfacing with any kind of Python sequence (e.g. lists and tuples) using calls like PySequence_Length(), PySequence_GetItem(), etc.) as well as many other useful protocols.
PyObject *
PyObject_CallMethod(PyObject *object, char *method_name, char *arg_format, ...);

This works for any object that has methods -- whether built-in or user-defined. You are responsible for eventually DECREF'ing the return value.
To call, e.g., a file object's "seek" method with arguments 10, 0 (assuming the file object pointer is "f"):
res = PyObject_CallMethod(f, "seek", "(ii)", 10, 0);
if (res == NULL) {
    ... an exception occurred ...
}
else {
    DECREF(res);
}

Note that since PyObject_CallMethod() always wants a tuple for the argument list, to call a function without arguments, pass "()" for the format, and to call a function with one argument, surround the argument in parentheses, e.g. "(i)".
In Python code, define an object that supports the "write()" method. Redirect sys.stdout and sys.stderr to this object. Call print_error, or just allow the standard traceback mechanism to work. Then, the output will go wherever your write() method sends it.
The easiest way to do this is to use the StringIO class in the standard library.
Sample code and use for catching stdout:
>>> class StdoutCatcher:
...     def __init__(self):
...         self.data = ''
...     def write(self, stuff):
...         self.data = self.data + stuff
...
>>> import sys
>>> sys.stdout = StdoutCatcher()
>>> print 'foo'
>>> print 'hello world!'
>>> sys.stderr.write(sys.stdout.data)
foo
hello world!
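The same idea as a non-interactive sketch, including the restore step that the session above omits (always rebind the real sys.stdout when you are done):

```python
import sys

# Capture print output by rebinding sys.stdout, then restore it.
class StdoutCatcher:
    def __init__(self):
        self.data = ''
    def write(self, stuff):
        self.data = self.data + stuff

saved = sys.stdout
sys.stdout = StdoutCatcher()
try:
    print('hello world!')      # goes into the catcher, not the console
finally:
    caught = sys.stdout
    sys.stdout = saved         # always restore the real stdout
assert caught.data == 'hello world!\n'
```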
module = PyImport_ImportModule("<modulename>");

If the module hasn't been imported yet (i.e. it is not yet present in sys.modules), this initializes the module; otherwise it simply returns the value of sys.modules["<modulename>"]. Note that it doesn't enter the module into any namespace -- it only ensures it has been initialized and is stored in sys.modules.
You can then access the module's attributes (i.e. any name defined in the module) as follows:
attr = PyObject_GetAttrString(module, "<attrname>");

Calling PyObject_SetAttrString(), to assign to variables in the module, also works.
A useful automated approach (which also works for C) is SWIG: http://www.swig.org/.
Remove lines:
#include "allobjects.h"
#include "modsupport.h"

And insert instead:
#include "Python.h"You may also need to add
#include "rename2.h"

if the module uses "old names".
This may happen with other ancient python modules as well, and the same fix applies.
Every module init function will have a line similar to:
module = Py_InitModule("yourmodule", yourmodule_functions);

If the string passed to this function is not the same name as your extension module, the SystemError will be raised.
In Python you can use the codeop module, which approximates the parser's behavior sufficiently. IDLE uses this, for example.
The easiest way to do it in C is to call PyRun_InteractiveLoop() (in a separate thread maybe) and let the Python interpreter handle the input for you. You can also set the PyOS_ReadlineFunctionPointer to point at your custom input function. See Modules/readline.c and Parser/myreadline.c for more hints.
However, sometimes you have to run the embedded Python interpreter in the same thread as the rest of your application, and you can't allow PyRun_InteractiveLoop() to stop while waiting for user input. One solution is then to call PyParser_ParseString() and test for e.error equal to E_EOF (which means the input is incomplete). Sample code fragment, untested, inspired by code from Alex Farber:
#include <Python.h>
#include <node.h>
#include <errcode.h>
#include <grammar.h>
#include <parsetok.h>
#include <compile.h>
int testcomplete(char *code)
  /* code should end in \n */
  /* return -1 for error, 0 for incomplete, 1 for complete */
{
  node *n;
  perrdetail e;

  n = PyParser_ParseString(code, &_PyParser_Grammar,
                           Py_file_input, &e);
  if (n == NULL) {
    if (e.error == E_EOF)
      return 0;
    return -1;
  }
  PyNode_Free(n);
  return 1;
}

Another solution is trying to compile the received string with Py_CompileString(). If it compiles fine, try to execute the returned code object by calling PyEval_EvalCode(); otherwise save the input for later. If the compilation fails, find out if it's an error or just more input is required - by extracting the message string from the exception tuple and comparing it to "unexpected EOF while parsing". Here is a complete example using the GNU readline library (you may want to ignore SIGINT while calling readline()):
#include <stdio.h>
#include <readline.h>
#include <Python.h>
#include <object.h>
#include <compile.h>
#include <eval.h>
int main (int argc, char* argv[])
{
  int i, j, done = 0;                  /* lengths of line, code */
  char ps1[] = ">>> ";
  char ps2[] = "... ";
  char *prompt = ps1;
  char *msg, *line, *code = NULL;
  PyObject *src, *glb, *loc;
  PyObject *exc, *val, *trb, *obj, *dum;

  Py_Initialize ();
  loc = PyDict_New ();
  glb = PyDict_New ();
  PyDict_SetItemString (glb, "__builtins__", PyEval_GetBuiltins ());

  while (!done)
  {
    line = readline (prompt);

    if (NULL == line)                  /* CTRL-D pressed */
    {
      done = 1;
    }
    else
    {
      i = strlen (line);

      if (i > 0)
        add_history (line);            /* save non-empty lines */

      if (NULL == code)                /* nothing in code yet */
        j = 0;
      else
        j = strlen (code);

      code = realloc (code, i + j + 2);
      if (NULL == code)                /* out of memory */
        exit (1);

      if (0 == j)                      /* code was empty, so */
        code[0] = '\0';                /* keep strncat happy */

      strncat (code, line, i);         /* append line to code */
      code[i + j] = '\n';              /* append '\n' to code */
      code[i + j + 1] = '\0';

      src = Py_CompileString (code, "<stdin>", Py_single_input);

      if (NULL != src)                 /* compiled just fine - */
      {
        if (ps1 == prompt ||           /* ">>> " or */
            '\n' == code[i + j - 1])   /* "... " and double '\n' */
        {                              /* so execute it */
          dum = PyEval_EvalCode ((PyCodeObject *)src, glb, loc);
          Py_XDECREF (dum);
          Py_XDECREF (src);
          free (code);
          code = NULL;
          if (PyErr_Occurred ())
            PyErr_Print ();
          prompt = ps1;
        }
      }                                /* syntax error or E_EOF? */
      else if (PyErr_ExceptionMatches (PyExc_SyntaxError))
      {
        PyErr_Fetch (&exc, &val, &trb);    /* clears exception! */

        if (PyArg_ParseTuple (val, "sO", &msg, &obj) &&
            !strcmp (msg, "unexpected EOF while parsing"))  /* E_EOF */
        {
          Py_XDECREF (exc);
          Py_XDECREF (val);
          Py_XDECREF (trb);
          prompt = ps2;
        }
        else                           /* some other syntax error */
        {
          PyErr_Restore (exc, val, trb);
          PyErr_Print ();
          free (code);
          code = NULL;
          prompt = ps1;
        }
      }
      else                             /* some non-syntax error */
      {
        PyErr_Print ();
        free (code);
        code = NULL;
        prompt = ps1;
      }

      free (line);
    }
  }

  Py_XDECREF (glb);
  Py_XDECREF (loc);
  Py_Finalize ();
  exit (0);
}
In your .gdbinit file (or interactively), add the command
br _PyImport_LoadDynamicModule
    $ gdb /local/bin/python
    (gdb) run myscript.py
    (gdb) continue   # repeat until your extension is loaded
    (gdb) finish     # so that your extension is loaded
    (gdb) br myfunction.c:50
    (gdb) continue
Since there are no begin/end brackets there cannot be a disagreement between the grouping perceived by the parser and the human reader. I remember long ago seeing a C fragment like this:

    if (x <= y)
            x++;
            y--;
    z++;

and staring at it for a long time wondering why y was being decremented even for x > y... (And I wasn't a C newbie then either.)
Since there are no begin/end brackets, Python is much less prone to coding-style conflicts. In C there are loads of different ways to place the braces (including the choice whether to place braces around single statements in certain cases, for consistency). If you're used to reading (and writing) code that uses one style, you will feel at least slightly uneasy when reading (or being required to write) another style. Many coding styles place begin/end brackets on a line by themselves. This makes programs considerably longer and wastes valuable screen space, making it harder to get a good overview of a program. Ideally, a function should fit on one basic tty screen (say, 20 lines). 20 lines of Python are worth a LOT more than 20 lines of C. This is not solely due to the lack of begin/end brackets (the lack of declarations also helps, and the powerful operations of course), but it certainly helps!
First, it makes it more obvious that you are using a method or instance attribute instead of a local variable. Reading "self.x" or "self.meth()" makes it absolutely clear that an instance variable or method is used even if you don't know the class definition by heart. In C++, you can sort of tell by the lack of a local variable declaration (assuming globals are rare or easily recognizable) -- but in Python, there are no local variable declarations, so you'd have to look up the class definition to be sure.
Second, it means that no special syntax is necessary if you want to explicitly reference or call the method from a particular class. In C++, if you want to use a method from base class that is overridden in a derived class, you have to use the :: operator -- in Python you can write baseclass.methodname(self, <argument list>). This is particularly useful for __init__() methods, and in general in cases where a derived class method wants to extend the base class method of the same name and thus has to call the base class method somehow.
Lastly, for instance variables, it solves a syntactic problem with assignment: since local variables in Python are (by definition!) those variables to which a value is assigned in a function body (and that aren't explicitly declared global), there has to be some way to tell the interpreter that an assignment was meant to assign to an instance variable instead of to a local variable, and it should preferably be syntactic (for efficiency reasons). C++ does this through declarations, but Python doesn't have declarations and it would be a pity having to introduce them just for this purpose. Using the explicit "self.var" solves this nicely. Similarly, for using instance variables, having to write "self.var" means that references to unqualified names inside a method don't have to search the instance's dictionaries.
Answer 2: Fortunately, there is Stackless Python, which has a completely redesigned interpreter loop that avoids the C stack. It's still experimental but looks very promising. Although it is binary compatible with standard Python, it's still unclear whether Stackless will make it into the core -- maybe it's just too revolutionary. Stackless Python currently lives here: http://www.stackless.com. A microthread implementation that uses it can be found here: http://world.std.com/~wware/uthread.html.
However, in Python, this is not a serious problem. Unlike lambda forms in other languages, where they add functionality, Python lambdas are only a shorthand notation if you're too lazy to define a function.
Functions are already first class objects in Python, and can be declared in a local scope. Therefore the only advantage of using a lambda form instead of a locally-defined function is that you don't need to invent a name for the function -- but that's just a local variable to which the function object (which is exactly the same type of object that a lambda form yields) is assigned!
    def test():
        class factorial:
            def __call__(self, n):
                if n <= 1: return 1
                return n * self(n-1)
        return factorial()

    fact = test()

The instance created by factorial() above acts like the recursive factorial function.
Mutually recursive functions can be passed to each other as arguments.
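As a small illustration of that point, here is a sketch (the function names are invented for the example) of two mutually recursive functions that receive each other as arguments instead of relying on global names:

```python
# Mutually recursive parity functions; each receives the other as an
# argument, so neither needs to know the other's global name.
def is_even(n, odd_fn):
    if n == 0:
        return 1
    return odd_fn(n - 1, is_even)

def is_odd(n, even_fn):
    if n == 0:
        return 0
    return even_fn(n - 1, is_odd)

print(is_even(10, is_odd))   # 1 (10 is even)
```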
A call to dict.keys() makes one fast scan over the dictionary (internally, the iteration function does exist), copying the pointers to the key objects into a pre-allocated list object of the right size. The iteration time isn't lost, since you'll have to iterate anyway -- unless in the majority of cases your loop terminates very prematurely (which I doubt, since you're getting the keys in random order).
I don't expose the dictionary iteration operation to Python programmers because the dictionary shouldn't be modified during the entire iteration -- if it is, there's a small chance that the dictionary is reorganized because the hash table becomes too full, and then the iteration may miss some items and see others twice. Exactly because this only occurs rarely, it would lead to hidden bugs in programs: it's easy never to have it happen during test runs if you only insert or delete a few items per iteration -- but your users will surely hit upon it sooner or later.
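Because keys() returns a snapshot rather than a live iterator, it is safe to delete entries while looping over that snapshot. A small sketch; the list() call is a hedge so it also works on newer Pythons where keys() returns a view:

```python
# take a snapshot of the keys, then shrink the dictionary freely
d = {'a': 1, 'b': 2, 'c': 3}
for key in list(d.keys()):
    if d[key] > 1:
        del d[key]
print(sorted(d.keys()))   # ['a']
```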
Several projects described in the Python newsgroup or at past Python conferences have shown that this approach is feasible, although the speedups reached so far are only modest (e.g. 2x). JPython uses the same strategy for compiling to Java bytecode. (Jim Hugunin has demonstrated that in combination with whole-program analysis, speedups of 1000x are feasible for small demo programs. See the website for the 1997 Python conference.)
Internally, Python source code is always translated into a "virtual machine code" or "byte code" representation before it is interpreted (by the "Python virtual machine" or "bytecode interpreter"). In order to avoid the overhead of parsing and translating modules that rarely change over and over again, this byte code is written on a file whose name ends in ".pyc" whenever a module is parsed (from a file whose name ends in ".py"). When the corresponding .py file is changed, it is parsed and translated again and the .pyc file is rewritten.
There is no performance difference once the .pyc file has been loaded (the bytecode read from the .pyc file is exactly the same as the bytecode created by direct translation). The only difference is that loading code from a .pyc file is faster than parsing and translating a .py file, so the presence of precompiled .pyc files will generally improve start-up time of Python scripts. If desired, the Lib/compileall.py module/script can be used to force creation of valid .pyc files for a given set of modules.
Note that the main script executed by Python, even if its filename ends in .py, is not compiled to a .pyc file. It is compiled to bytecode, but the bytecode is not saved to a file.
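The compileall machinery mentioned above can also be driven from Python itself. A minimal sketch, which builds a throwaway directory so it is self-contained (the module name and contents are invented for the example):

```python
import compileall
import os
import tempfile

# make a throwaway directory with one module in it
tree = tempfile.mkdtemp()
f = open(os.path.join(tree, "mymod.py"), "w")
f.write("x = 1\n")
f.close()

# byte-compile everything under the tree; returns a true value on success
ok = compileall.compile_dir(tree, quiet=1)
print(ok)
```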
If you are looking for a way to translate Python programs in order to distribute them in binary form, without the need to distribute the interpreter and library as well, have a look at the freeze.py script in the Tools/freeze directory. This creates a single binary file incorporating your program, the Python interpreter, and those parts of the Python library that are needed by your program. Of course, the resulting binary will only run on the same type of platform as that used to create it.
On the other hand, JPython relies on the Java runtime; so it uses the JVM's garbage collector. This difference can cause some subtle porting problems if your Python code depends on the behavior of the reference counting implementation.
Two exceptions to bear in mind for standard Python are:
1) If the object lies on a circular reference path it won't be freed unless the circularities are broken. EG:

    List = [None]
    List[0] = List

List will not be freed unless the circularity (List[0] is List) is broken. The reason List will not be freed is that although it may become inaccessible, the list contains a reference to itself, and reference counting only deallocates an object when all references to it are destroyed. To break the circular reference path we must destroy the reference, as in:

    List[0] = None

So, if your program creates circular references (and if it is long running and/or consumes lots of memory) it may have to do some explicit management of circular structures. In many application domains this is needed rarely, if ever.
CPython 2.0 fixes this problem by periodically executing a cycle detection algorithm which looks for inaccessible cycles and deletes the objects involved. A new gc module provides functions to perform a garbage collection, obtain debugging statistics, and tune the collector's parameters.
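A minimal sketch of the gc module in action, assuming Python 2.0 or later:

```python
import gc

# build a self-referential list, drop the last name referring to it,
# then ask the cycle collector to reclaim it
cycle = []
cycle.append(cycle)
del cycle
found = gc.collect()   # returns the number of unreachable objects found
print(found >= 1)      # True
```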
Running the cycle detection algorithm takes some time, and therefore will result in some additional overhead. It is hoped that after we've gained experience with the cycle collection from using 2.0, Python 2.1 will be able to minimize the overhead with careful tuning. It's not yet obvious how much performance is lost, because benchmarking this is tricky and depends crucially on how often the program creates and destroys objects. The detection of cycles can be disabled when Python is compiled, if you can't afford even a tiny speed penalty or suspect that the cycle collection is buggy, by specifying the "--without-cycle-gc" switch when running the configure script.
2) Sometimes objects get stuck in "tracebacks" temporarily and hence are not deallocated when you might expect. Clear the tracebacks via:

    import sys
    sys.exc_traceback = sys.last_traceback = None

Tracebacks are used for reporting errors and implementing debuggers and related things. They contain a portion of the program state extracted during the handling of an exception (usually the most recent exception).
In the absence of circularities and modulo tracebacks, Python programs need not explicitly manage memory.
It is often suggested that Python could benefit from fully general garbage collection. It's looking less and less likely that Python will ever get "automatic" garbage collection (GC). For one thing, unless this were added to C as a standard feature, it's a portability pain in the ass. And yes, I know about the Xerox library. It has bits of assembler code for most common platforms. Not for all. And although it is mostly transparent, it isn't completely transparent (when I once linked Python with it, it dumped core).
"Proper" GC also becomes a problem when Python gets embedded into other applications. While in a stand-alone Python it may be fine to replace the standard malloc() and free() with versions provided by the GC library, an application embedding Python may want to have its own substitute for malloc() and free(), and may not want Python's. Right now, Python works with anything that implements malloc() and free() properly.
In JPython, which has garbage collection, the following code (which is fine in C Python) will probably run out of file descriptors long before it runs out of memory:
    for file in <very long list of files>:
        f = open(file)
        c = f.read(1)

Using the current reference counting and destructor scheme, each new assignment to f closes the previous file. Using GC, this is not guaranteed. Sure, you can think of ways to fix this. But it's not off-the-shelf technology. If you want to write code that will work with any Python implementation, you should explicitly close the file; this will work regardless of GC:

    for file in <very long list of files>:
        f = open(file)
        c = f.read(1)
        f.close()
All that said, somebody has managed to add GC to Python using the GC library from Xerox, so you can see for yourself. See

    http://starship.python.net/crew/gandalf/gc-ss.html

See also question 4.17 for ways to plug some common memory leaks manually.
If you're not satisfied with the answers here, before you post to the newsgroup, please read this summary of past discussions on GC for Python by Moshe Zadka:
http://www.geocities.com/TheTropics/Island/2932/gcpy.html
Immutable tuples are useful in situations where you need to pass a few items to a function and don't want the function to modify the tuple; for example,

    point1 = (120, 140)
    point2 = (200, 300)
    record(point1, point2)
    draw(point1, point2)

You don't want to have to think about what would happen if record() changed the coordinates -- it can't, because the tuples are immutable.
On the other hand, when creating large lists dynamically, it is absolutely crucial that they are mutable -- adding elements to a tuple one by one requires using the concatenation operator, which makes it quadratic in time.
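The difference can be seen in a small sketch: the list version appends in place, while the tuple version copies the entire sequence on each step, which is what makes it quadratic overall:

```python
# grow a sequence of 1000 numbers both ways
items_list = []
for i in range(1000):
    items_list.append(i)              # amortized constant time per step

items_tuple = ()
for i in range(1000):
    items_tuple = items_tuple + (i,)  # copies the whole tuple each step

print(list(items_tuple) == items_list)   # True
```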
As a general guideline, use tuples like you would use structs in C or records in Pascal, use lists like (variable length) arrays.
This makes indexing a list (a[i]) an operation whose cost is independent of the size of the list or the value of the index.
When items are appended or inserted, the array of references is resized. Some cleverness is applied to improve the performance of appending items repeatedly; when the array must be grown, some extra space is allocated so the next few times don't require an actual resize.
Compared to B-trees, this gives better performance for lookup (the most common operation by far) under most circumstances, and the implementation is simpler.
If you think you need to have a dictionary indexed with a list, try to use a tuple instead. The function tuple(l) creates a tuple with the same entries as the list l.
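For example, a list can be converted once and the resulting tuple used as the key (the names here are invented for the sketch):

```python
# a list cannot be a dictionary key, but the equivalent tuple can
coords = [10, 20]
grid = {}
grid[tuple(coords)] = "treasure"
print(grid[(10, 20)])   # treasure
```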
Some unacceptable solutions that have been proposed:
- Hash lists by their address (object ID). This doesn't work because if you construct a new list with the same value it won't be found; e.g.,

    d = {[1,2]: '12'}
    print d[[1,2]]

will raise a KeyError exception because the id of the [1,2] used in the second line differs from that in the first line. In other words, dictionary keys should be compared using '==', not using 'is'.
- Make a copy when using a list as a key. This doesn't work because the list (being a mutable object) could contain a reference to itself, and then the copying code would run into an infinite loop.
- Allow lists as keys but tell the user not to modify them. This would allow a class of hard-to-track bugs in programs that I'd rather not see; it invalidates an important invariant of dictionaries (every value in d.keys() is usable as a key of the dictionary).
- Mark lists as read-only once they are used as a dictionary key. The problem is that it's not just the top-level object that could change its value; you could use a tuple containing a list as a key. Entering anything as a key into a dictionary would require marking all objects reachable from there as read-only -- and again, self-referential objects could cause an infinite loop again (and again and again).
There is a trick to get around this if you need to, but use it at your own risk: You can wrap a mutable structure inside a class instance which has both a __cmp__ and a __hash__ method.
    class listwrapper:
        def __init__(self, the_list):
            self.the_list = the_list
        def __cmp__(self, other):
            return cmp(self.the_list, other.the_list)
        def __hash__(self):
            l = self.the_list
            result = 98767 - len(l)*555
            for i in range(len(l)):
                try:
                    result = result + (hash(l[i]) % 9999999) * 1001 + i
                except:
                    result = (result % 7777777) + i * 333
            return result

Note that __cmp__ must return a negative, zero or positive number (hence the use of cmp() rather than '=='), and that the hash computation is complicated by the possibility that some members of the list may be unhashable and also by the possibility of arithmetic overflow.
You must make sure that the hash value for all such wrapper objects that reside in a dictionary (or other hash based structure), remain fixed while the object is in the dictionary (or other structure).
Furthermore it must always be the case that if o1 == o2 (ie o1.__cmp__(o2)==0) then hash(o1)==hash(o2) (ie, o1.__hash__() == o2.__hash__()), regardless of whether the object is in a dictionary or not. If you fail to meet these restrictions dictionaries and other hash based structures may misbehave!
In the case of listwrapper above whenever the wrapper object is in a dictionary the wrapped list must not change to avoid anomalies. Don't do this unless you are prepared to think hard about the requirements and the consequences of not meeting them correctly. You've been warned!
Lists are arrays in the C or Pascal sense of the word (see question 6.16). The array module also provides methods for creating arrays of fixed types with compact representations (but they are slower to index than lists). Also note that the Numerics extensions and others define array-like structures with various characteristics as well.
To get Lisp-like lists, emulate cons cells
    lisp_list = ("like", ("this", ("example", None)))

using tuples (or lists, if you want mutability). Here the analogue of the Lisp car is lisp_list[0] and the analogue of cdr is lisp_list[1]. Only do this if you're sure you really need to (it's usually a lot slower than using Python lists).
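Walking such a structure is a simple loop; the following sketch collects the "car" of each cell until it hits the terminating None:

```python
# traverse a cons-cell chain built from nested tuples
lisp_list = ("like", ("this", ("example", None)))
words = []
while lisp_list is not None:
    words.append(lisp_list[0])   # car
    lisp_list = lisp_list[1]     # cdr
print(words)   # ['like', 'this', 'example']
```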
Think of Python lists as mutable heterogeneous arrays of Python objects (say that 10 times fast :) ).
As a result, here's the idiom to iterate over the keys of a dictionary in sorted order:
    keys = dict.keys()
    keys.sort()
    for key in keys:
        ...do whatever with dict[key]...
A good test suite for a module can at once provide a regression test and serve as a module interface specification (even better since it also gives example usage). Look to many of the standard libraries which often have a "script interpretation" which provides a simple "self test." Even modules which use complex external interfaces can often be tested in isolation using trivial "stub" emulations of the external interface.
An appropriate testing discipline (if enforced) can help build large complex applications in Python as well as having interface specifications would (or better). Of course Python allows you to get sloppy and not do it. Also you might want to design your code with an eye toward making it easily tested.
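The "self test" idiom mentioned above looks roughly like this (the module contents here are invented for the sketch):

```python
# a module that tests itself when run as a script
def double(x):
    return 2 * x

def _test():
    assert double(2) == 4
    assert double("ab") == "abab"
    print("all tests passed")

if __name__ == "__main__":
    _test()
```

Importing the module skips the test; running it as a script exercises it.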
Remember that in Python usage "type" refers to a C implementation of an object. To distinguish among instances of different classes use Instance.__class__, and also look to 4.47. Sorry for the terminological confusion, but at this point in Python's development nothing can be done!
This may happen if there are circular references (see question 4.17). There are also certain bits of memory that are allocated by the C library that are impossible to free (e.g. a tool like Purify will complain about these).
But in general, Python 1.5 and beyond (in contrast with earlier versions) is quite aggressive about cleaning up memory on exit.
If you want to force Python to delete certain things on deallocation use the sys.exitfunc hook to force those deletions. For example if you are debugging an extension module using a memory analysis tool and you wish to make Python deallocate almost everything you might use an exitfunc like this one:
    import sys

    def my_exitfunc():
        print "cleaning up"
        import sys
        # do order-dependent deletions here ...
        # now delete everything else in arbitrary order
        for x in sys.modules.values():
            d = x.__dict__
            for name in d.keys():
                del d[name]

    sys.exitfunc = my_exitfunc

Other exitfuncs can be less drastic, of course.
(In fact, this one just does what Python now already does itself; but the example of using sys.exitfunc to force cleanups is still useful.)
    instance.attribute(arg1, arg2)

usually translates to the equivalent of

    Class.attribute(instance, arg1, arg2)

where Class is a (super)class of instance. Similarly,

    instance.attribute = value

sets an attribute of an instance (overriding any attribute of a class that instance inherits).
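The equivalence can be checked directly; a small sketch with an invented class:

```python
class Greeter:
    def hello(self, name):
        return "hello, " + name

g = Greeter()
# calling through the instance and through the class give the same result
print(g.hello("world") == Greeter.hello(g, "world"))   # True
```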
Sometimes programmers want to have different behaviours -- they want a method which does not bind to the instance and a class attribute which changes in place. Python does not preclude these behaviours, but you have to adopt a convention to implement them. One way to accomplish this is to use "list wrappers" and global functions.
    def C_hello():
        print "hello"

    class C:
        hello = [C_hello]
        counter = [0]

    I = C()

Here I.hello[0]() acts very much like a "class method" and I.counter[0] = 2 alters C.counter (and doesn't override it). If you don't understand why you'd ever want to do this, that's because you are pure of mind, and you probably never will want to do it! This is dangerous trickery, not recommended when avoidable. (Inspired by Tim Peters's discussion.)
Because of this feature it is good programming practice not to use mutable objects as default values, but to create them inside the function. Don't write:

    def foo(dict={}):  # XXX shared reference to one dict for all calls
        ...

but:

    def foo(dict=None):
        if dict is None:
            dict = {}  # create a new dict for local namespace

See page 182 of "Internet Programming with Python" for one discussion of this feature. Or see the top of page 144 or bottom of page 277 in "Programming Python" for another discussion.
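The pitfall itself is easy to demonstrate; note how the second call sees the list left over from the first (the function name is invented for the sketch):

```python
# the default list is created once, at function definition time
def remember(item, seen=[]):
    seen.append(item)
    return seen

print(remember(1))   # [1]
print(remember(2))   # [1, 2] -- the same list as before!
```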
    class label: pass  # declare a label

    try:
        ...
        if (condition): raise label()  # goto label
        ...
    except label:  # where to goto
        pass
    ...

This doesn't allow you to jump into the middle of a loop, but that's usually considered an abuse of goto anyway. Use sparingly.
    def linear(a, b):
        def result(x, a=a, b=b):
            return a*x + b
        return result

Or using callable objects:

    class linear:
        def __init__(self, a, b):
            self.a, self.b = a, b
        def __call__(self, x):
            return self.a * x + self.b

In both cases:

    taxes = linear(0.3, 2)

gives a callable object where taxes(10e6) == 0.3 * 10e6 + 2.
The defaults strategy has the disadvantage that the default arguments could be accidentally or maliciously overridden. The callable objects approach has the disadvantage that it is a bit slower and a bit longer. Note however that a collection of callables can share their signature via inheritance, e.g.:

    class exponential(linear):
        # __init__ inherited
        def __call__(self, x):
            return self.a * (x ** self.b)

On comp.lang.python, [email protected] points out that an object can encapsulate state for several methods in order to emulate the "closure" concept from functional programming languages, for example:

    class counter:
        value = 0
        def set(self, x): self.value = x
        def up(self): self.value = self.value + 1
        def down(self): self.value = self.value - 1

    count = counter()
    inc, dec, reset = count.up, count.down, count.set

Here inc, dec and reset act like "functions which share the same closure containing the variable count.value" (if you like that way of thinking).
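A self-contained sketch of that trick, with the expected result worked out:

```python
# bound methods share the instance, so they share its state
class counter:
    value = 0
    def up(self):
        self.value = self.value + 1
    def down(self):
        self.value = self.value - 1

count = counter()
inc, dec = count.up, count.down
inc(); inc(); inc(); dec()
print(count.value)   # 2
```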
Note that JPython doesn't have this restriction!
Raw strings were designed to ease creating input for processors (chiefly regular expression engines) that want to do their own backslash escape processing. Such processors consider an unmatched trailing backslash to be an error anyway, so raw strings disallow that. In return, they allow you to pass on the string quote character by escaping it with a backslash. These rules work well when r-strings are used for their intended purpose.
If you're trying to build Windows pathnames, note that all Windows system calls accept forward slashes too:
    f = open("/mydir/file.txt")  # works fine!

If you're trying to build a pathname for a DOS command, try e.g. one of:

    dir = r"\this\is\my\dos\dir" "\\"
    dir = r"\this\is\my\dos\dir\ "[:-1]
    dir = "\\this\\is\\my\\dos\\dir\\"
    while (line = readline(f)) {
        ...do something with line...
    }

where in Python you're forced to write this:

    while 1:
        line = f.readline()
        if not line:
            break
        ...do something with line...

This issue comes up in the Python newsgroup with alarming frequency -- search Deja News for past messages about assignment expressions. The reason for not allowing assignment in Python expressions is a common, hard-to-find bug in those other languages, caused by this construct:

    if (x = 0) {
        ...error handling...
    }
    else {
        ...code that only works for nonzero x...
    }

Many alternatives have been proposed. Most are hacks that save some typing but use arbitrary or cryptic syntax or keywords, and fail the simple criterion that I use for language change proposals: it should intuitively suggest the proper meaning to a human reader who has not yet been introduced to the construct.
The earliest time something can be done about this will be with Python 2.0 -- if it is decided that it is worth fixing. An interesting phenomenon is that most experienced Python programmers recognize the "while 1" idiom and don't seem to be missing the assignment in expression construct much; it's only the newcomers who express a strong desire to add this to the language.
One fairly elegant solution would be to introduce a new operator for assignment in expressions spelled ":=" -- this avoids the "=" instead of "==" problem. It would have the same precedence as comparison operators but the parser would flag combination with other comparisons (without disambiguating parentheses) as an error.
Finally -- there's an alternative way of spelling this that seems attractive but is generally less robust than the "while 1" solution:
    line = f.readline()
    while line:
        ...do something with line...
        line = f.readline()

The problem with this is that if you change your mind about exactly how you get the next line (e.g. you want to change it into sys.stdin.readline()) you have to remember to change two places in your program -- the second one hidden at the bottom of the loop.
Most windows extensions can be found (or referenced) at http://www.python.org/windows/
Windows 3.1/DOS support seems to have dropped off recently. You may need to settle for an old version of Python on these platforms. One such port is WPY:
WPY: Ports to DOS, Windows 3.1(1), Windows 95, Windows NT and OS/2. Also contains a GUI package that offers portability between Windows (not DOS) and Unix, and native look and feel on both. ftp://ftp.python.org/pub/python/wpy/.
Uwe Zessin has ported Python 1.5.x to OpenVMS. See http://decus.decus.de/~zessin/.
But if you are sure you have the only distribution with a hope of working on your system, then...
You still need to copy the files from the distribution directory "python/Lib" to your system. If you don't have the full distribution, you can get the file lib<version>.tar.gz from most ftp sites carrying Python; this is a subset of the distribution containing just those files, e.g. ftp://ftp.python.org/pub/python/src/lib1.4.tar.gz.
Once you have installed the library, you need to point sys.path to it. Assuming the library is in C:\misc\python\lib, the following commands will point your Python interpreter to it (note the doubled backslashes -- you can also use single forward slashes instead):
    >>> import sys
    >>> sys.path.insert(0, 'C:\\misc\\python\\lib')
    >>>

For a more permanent effect, set the environment variable PYTHONPATH, as follows (talking to a DOS prompt):
C> SET PYTHONPATH=C:\misc\python\lib
Regarding the same question for the PC, Kurt Wm. Hemr writes: "While anyone with a pulse could certainly figure out how to do the same on MS-Windows, I would recommend the NotGNU Emacs clone for MS-Windows. Not only can you easily resave and "reload()" from Python after making changes, but since WinNot auto-copies to the clipboard any text you select, you can simply select the entire procedure (function) which you changed in WinNot, switch to QWPython, and shift-ins to reenter the changed program unit."
If you're using Windows 95 or Windows NT, you should also know about PythonWin, which provides a GUI framework, with a mouse-driven editor, an object browser, and a GUI-based debugger. See

    http://www.python.org/ftp/python/pythonwin/

for details.
    http://www.python.org/download/download_windows.html

One warning: don't attempt to use Tkinter from PythonWin (Mark Hammond's IDE). Use it from the command line interface (python.exe) or the windowless interpreter (pythonw.exe).
    "...\python.exe -u ..."

for the CGI execution. The -u (unbuffered) option on NT and Win95 prevents the interpreter from altering newlines in the standard input and output. Without it post/multipart requests will seem to have the wrong length and binary (e.g. GIF) responses may get garbled (resulting in, e.g., a "broken image").
You should use the win32pipe module's popen() instead which doesn't depend on having an attached Win32 console.
Example:
    import win32pipe
    f = win32pipe.popen('dir /c c:\\')
    print f.readlines()
    f.close()
    import sys
    if sys.platform == "win32":
        import win32pipe
        popen = win32pipe.popen
    else:
        import os
        popen = os.popen

(See FAQ 7.13 for an explanation of why you might want to do something like this.) Also you can try to import a module and use a fallback if the import fails:
    try:
        import really_fast_implementation
        choice = really_fast_implementation
    except ImportError:
        import slower_implementation
        choice = slower_implementation
On the Microsoft IIS server or on the Win95 MS Personal Web Server you set up python in the same way that you would set up any other scripting engine.
Run regedt32 and go to:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\W3SVC\Parameters\ScriptMap
and enter the following line (making any specific changes that your system may need)
.py :REG_SZ: c:\<path to python>\python.exe -u %s %s
This line will allow you to call your script with a simple reference like http://yourserver/scripts/yourscript.py, provided "scripts" is an "executable" directory for your server (which it usually is by default). The "-u" flag specifies unbuffered and binary mode for stdin, which is needed when working with binary data.
In addition, it is recommended by people who would know that using ".py" may not be a good idea for the file extensions when used in this context (you might want to reserve *.py for support modules and use *.cgi or *.cgp for "main program" scripts). However, that issue is beyond this Windows FAQ entry.
Netscape Servers: Information on this topic exists at: http://home.netscape.com/comprod/server_central/support/fasttrack_man/programs.htm#1010870
(Search for "keypress" to find an answer for Unix as well.)
    http://www.python.org/doc/essays/styleguide.html

Under any editor, mixing tabs and spaces is a bad idea. MSVC is no different in this respect, and is easily configured to use spaces: take Tools -> Options -> Tabs, and for file type "Default" set "Tab size" and "Indent size" to 4, and select the "Insert spaces" radio button.
If you suspect mixed tabs and spaces are causing problems in leading whitespace, run Python with the -t switch, or run Tools/Scripts/tabnanny.py to check a directory tree in batch mode.
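tabnanny can also be driven from Python code rather than the command line; a sketch (the file name suspect.py and its contents are invented; the sample indentation only lines up if a tab is worth exactly eight spaces, which is the kind of ambiguity tabnanny exists to flag):

```python
import os
import tabnanny
import tempfile

# A function body indented once with a tab and once with eight
# spaces: consistent under tab size 8, inconsistent otherwise.
source = "def f():\n\tx = 1\n        y = 2\n"

path = os.path.join(tempfile.mkdtemp(), "suspect.py")
f = open(path, "w")
f.write(source)
f.close()

# For an ambiguous file, prints the file name, the line number,
# and the offending line.
tabnanny.check(path)
```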
def kill(pid):
    """kill function for Win32"""
    import win32api
    handle = win32api.OpenProcess(1, 0, pid)
    return (0 != win32api.TerminateProcess(handle, 0))
>>> import os
>>> os.path.isdir('\\\\rorschach\\public')
0
>>> os.path.isdir('\\\\rorschach\\public\\')
1

[Blake Winton responds:] I've had the same problem doing "Start >> Run" and then a directory on a shared drive. If I use "\\rorschach\public", it will fail, but if I use "\\rorschach\public\", it will work. For that matter, os.stat() does the same thing (well, it gives an error for "\\\\rorschach\\public", but you get the idea)...
I've got a theory about why this happens, but it's only a theory. NT knows the difference between shared directories and regular directories. "\\rorschach\public" isn't a directory; it's really an IPC abstraction. This theory is lent some credence by the fact that when you're mapping a network drive, you can't map "\\rorschach\public\utils", only "\\rorschach\public".
[Clarification by [email protected]] It's not actually a Python question, as Python is working just fine; it's clearing up something a bit muddled about Windows networked drives.
It helps to think of share points as being like drive letters. Example:
k:          is not a directory
k:\         is a directory
k:\media    is a directory
k:\media\   is not a directory

The same rules apply if you substitute "k:" with "\\conky\foo":
\\conky\foo         is not a directory
\\conky\foo\        is a directory
\\conky\foo\media   is a directory
\\conky\foo\media\  is not a directory
I think this happens because the application was compiled with a different set of compiler flags than Python15.DLL. It seems that some compiler flags affect the standard I/O library in such a way that mixing flags makes calls fail. You need to set it for the non-debug multi-threaded DLL (/MD on the command line, or via MSVC under Project Settings -> C++/Code Generation, then the "Use run-time library" dropdown).
Also note that you can not mix-and-match Debug and Release versions. If you wish to use the Debug Multithreaded DLL, then your module _must_ have an "_d" appended to the base name.
ImportError: DLL load failed: One of the library files needed to run this application cannot be found.

It could be that you haven't installed Tcl/Tk. But if you did install Tcl/Tk, and the Wish application works correctly, the problem may be that its installer didn't manage to edit the autoexec.bat file correctly. It tries to add a statement that changes the PATH environment variable to include the Tcl/Tk 'bin' subdirectory, but sometimes this edit doesn't quite work. Opening autoexec.bat with Notepad usually reveals what the problem is.
(One additional hint, noted by David Szafranski: you can't use long filenames here; e.g. use C:\PROGRA~1\Tcl\bin instead of C:\Program Files\Tcl\bin.)
Simply rename the downloaded file to have the .TGZ extension, and WinZip will be able to handle it. (If your copy of WinZip doesn't, get a newer one from http://www.winzip.com.)
The Python 1.5.* DLLs (python15.dll) are all compiled with MS VC++ 5.0 and with multithreading-DLL options (/MD, I think).
If you can't change compilers or flags, try using PyRun_SimpleString(). A trick to get it to run an arbitrary file is to construct a call to execfile() with the name of your file as argument.
You can use freeze on Windows, but you must download the source tree (see http://www.python.org/download/download_source.html). This is recommended for Python 1.5.2 (and betas thereof) only; older versions don't quite work.
You need the Microsoft VC++ 5.0 compiler (maybe it works with 6.0 too). You probably need to build Python -- the project files are all in the PCbuild directory.
The freeze program is in the Tools\freeze subdirectory of the source tree.
Note that the search path for foo.pyd is PYTHONPATH, not the same as the path that Windows uses to search for foo.dll. Also, foo.pyd need not be present to run your program, whereas if you linked your program with a dll, the dll is required. Of course, foo.pyd is required if you want to say "import foo". In a dll, linkage is declared in the source code with __declspec(dllexport). In a .pyd, linkage is defined in a list of available functions.
Cause: you have an old Tcl/Tk DLL built with cygwin in your path (probably C:\Windows). You must use the Tcl/Tk DLLs from the standard Tcl/Tk installation (Python 1.5.2 comes with one).
Win2K:
The standard installer already associates the .py extension with a file type (Python.File) and gives that file type an open command that runs the interpreter (D:\Program Files\Python\python.exe "%1" %*). This is enough to make scripts executable from the command prompt as 'foo.py'. If you'd rather be able to execute the script by simply typing 'foo' with no extension, you need to add .py to the PATHEXT environment variable.
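The PATHEXT check can be sketched in Python itself (py_in_pathext is a hypothetical helper; the dictionaries stand in for real environments so the example doesn't depend on the machine it runs on):

```python
import os

# True if typing 'foo' alone would let the command processor
# find foo.py via the PATHEXT extension list.
def py_in_pathext(environ=os.environ):
    pathext = environ.get("PATHEXT", "")
    exts = [e.strip().lower() for e in pathext.split(";") if e.strip()]
    return ".py" in exts

# Simulated environments rather than the real one:
print(py_in_pathext({"PATHEXT": ".COM;.EXE;.BAT;.CMD;.py"}))  # True
print(py_in_pathext({"PATHEXT": ".COM;.EXE;.BAT;.CMD"}))      # False
```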
WinNT:
The steps taken by the installer as described above allow you to run a script with 'foo.py', but a long-standing bug in the NT command processor prevents you from redirecting the input or output of any script executed in this way. This is often important.
An appropriate incantation for making a Python script executable under WinNT is to give the file an extension of .cmd and add the following as the first line:
@setlocal enableextensions & python -x %~f0 %* & goto :EOF

Win9x:
[Due to Bruce Eckel]
@echo off
rem = """
rem run python on this bat file. Needs the full path where
rem you keep your python files. The -x causes python to skip
rem the first line of the file:
python -x c:\aaa\Python\"%0".bat %1 %2 %3 %4 %5 %6 %7 %8 %9
goto endofpython
rem """
# The python program goes here:
print "hello, Python"
# For the end of the batch file:
rem = """
:endofpython
rem """
This version uses CTL3D32.DLL whitch is not the correct version. This version is used for windows NT applications only.

[Tim Peters] This is a Microsoft DLL, and a notorious source of problems. The msg means what it says: you have the wrong version of this DLL for your operating system. The Python installation did not cause this -- something else you installed previously overwrote the DLL that came with your OS (probably older shareware of some sort, but there's no way to tell now). If you search for "CTL3D32" using any search engine (AltaVista, for example), you'll find hundreds and hundreds of web pages complaining about the same problem with all sorts of installation programs. They'll point you to ways to get the correct version reinstalled on your system (since Python doesn't cause this, we can't fix it).
David A Burton has written a little program to fix this. Go to http://www.burtonsys.com/download.html and click on "ctl3dfix.zip"
Embedding the Python interpreter in a Windows app can be summarized as follows:
1. Do _not_ build Python into your .exe file directly. On Windows, Python must be a DLL to handle importing modules that are themselves DLL's. (This is the first key undocumented fact.) Instead, link to python15.dll; it is typically installed in c:\Windows\System.
You can link to Python statically or dynamically. Linking statically means linking against python15.lib. The drawback is that your app won't run if python15.dll does not exist on your system.
General note: python15.lib is the so-called "import lib" corresponding to python.dll. It merely defines symbols for the linker.
Borland note: convert python15.lib to OMF format using Coff2Omf.exe first.
Linking dynamically greatly simplifies link options; everything happens at run time. Your code must load python15.dll using the Windows LoadLibraryEx routine. The code must also use access routines and data in python15.dll (that is, Python's C API's) using pointers obtained by the Windows GetProcAddress routine. Macros can make using these pointers transparent to any C code that calls routines in Python's C API.
2. If you use SWIG, it is easy to create a Python "extension module" that will make the app's data and methods available to Python. SWIG will handle just about all the grungy details for you. The result is C code that you link _into your .exe file_ (!) You do _not_ have to create a DLL file, and this also simplifies linking.
3. SWIG will create an init function (a C function) whose name depends on the name of the extension module. For example, if the name of the module is leo, the init function will be called initleo(). If you use SWIG shadow classes, as you should, the init function will be called initleoc(). This initializes a mostly hidden helper class used by the shadow class.
The reason you can link the C code in step 2 into your .exe file is that calling the initialization function is equivalent to importing the module into Python! (This is the second key undocumented fact.)
4. In short, you can use the following code to initialize the Python interpreter with your extension module.
#include "Python.h"
...
Py_Initialize();                      // Initialize Python.
initmyAppc();                         // Initialize (import) the helper class.
PyRun_SimpleString("import myApp");   // Import the shadow class.

5. There are two problems with Python's C API which will become apparent if you use a compiler other than MSVC, the compiler used to build python15.dll.
Problem 1: The so-called "Very High Level" functions that take FILE * arguments will not work in a multi-compiler environment; each compiler's notion of a struct FILE will be different. Warnings should be added to the Python documentation! From an implementation standpoint these are very _low_ level functions.
Problem 2: SWIG generates the following code when generating wrappers to void functions:
Py_INCREF(Py_None);
_resultobj = Py_None;
return _resultobj;

Alas, Py_None is a macro that expands to a reference to a complex data structure called _Py_NoneStruct inside python15.dll. Again, this code will fail in a multi-compiler environment. Replace such code by:
return Py_BuildValue("");

It may be possible to use SWIG's %typemap command to make the change automatically, though I have not been able to get this to work (I'm a complete SWIG newbie).
6. Using a Python shell script to put up a Python interpreter window from inside your Windows app is not a good idea; the resulting window will be independent of your app's windowing system. Rather, you (or the wxPythonWindow class) should create a "native" interpreter window. It is easy to connect that window to the Python interpreter. You can redirect Python's i/o to _any_ object that supports read and write, so all you need is a Python object (defined in your extension module) that contains read and write methods.
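The redirection trick in point 6 can be sketched in pure Python (OutputWindow is an invented stand-in for a "native" interpreter window; a real one would draw the text instead of collecting it):

```python
import sys

class OutputWindow:
    """Stand-in for a native interpreter window: anything with a
    write method can replace sys.stdout."""
    def __init__(self):
        self.text = ""
    def write(self, s):
        self.text = self.text + s

window = OutputWindow()
saved = sys.stdout
sys.stdout = window          # interpreter output now goes to the window
try:
    print("hello from the embedded interpreter")
finally:
    sys.stdout = saved       # always restore the real stdout

print(repr(window.text))
```

For a full interactive window you would give the object a read method as well and assign it to sys.stdin in the same way.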