SF patch 576101, by Oren Tirosh: alternative implementation of
interning.  I modified Oren's patch significantly, but the basic idea
and most of the implementation is unchanged.  Interned strings created
with PyString_InternInPlace() are now mortal, and you must keep a
reference to the resulting string around; use the new function
PyString_InternImmortal() to create immortal interned strings.
gvanrossum committed Aug 19, 2002
1 parent d8dbf84 commit 45ec02a
Showing 7 changed files with 171 additions and 106 deletions.
6 changes: 4 additions & 2 deletions Doc/lib/libfuncs.tex
@@ -518,8 +518,10 @@ \section{Built-in Functions \label{built-in-funcs}}
 be done by a pointer compare instead of a string compare. Normally,
 the names used in Python programs are automatically interned, and
 the dictionaries used to hold module, class or instance attributes
-have interned keys. Interned strings are immortal (never get
-garbage collected).
+have interned keys. \versionchanged[Interned strings are not
+immortal (like they used to be in Python 2.2 and before);
+you must keep a reference to the return value of \function{intern()}
+around to benefit from it]{2.3}
 \end{funcdesc}

 \begin{funcdesc}{isinstance}{object, classinfo}
7 changes: 5 additions & 2 deletions Include/modsupport.h
@@ -23,8 +23,8 @@ PyAPI_FUNC(int) PyModule_AddObject(PyObject *, char *, PyObject *);
 PyAPI_FUNC(int) PyModule_AddIntConstant(PyObject *, char *, long);
 PyAPI_FUNC(int) PyModule_AddStringConstant(PyObject *, char *, char *);

-#define PYTHON_API_VERSION 1011
-#define PYTHON_API_STRING "1011"
+#define PYTHON_API_VERSION 1012
+#define PYTHON_API_STRING "1012"
 /* The API version is maintained (independently from the Python version)
    so we can detect mismatches between the interpreter and dynamically
    loaded modules. These are diagnosed by an error message but
@@ -38,6 +38,9 @@ PyAPI_FUNC(int) PyModule_AddStringConstant(PyObject *, char *, char *);
    Please add a line or two to the top of this log for each API
    version change:

+   19-Aug-2002  GvR  1012  Changes to string object struct for
+                           interning changes, saving 3 bytes.
+
    17-Jul-2001  GvR  1011  Descr-branch, just to be on the safe side

    25-Jan-2001  FLD  1010  Parameters added to PyCode_New() and
12 changes: 10 additions & 2 deletions Include/stringobject.h
@@ -25,7 +25,7 @@ functions should be applied to nil objects.
 */

 /* Caching the hash (ob_shash) saves recalculation of a string's hash value.
-   Interning strings (ob_sinterned) tries to ensure that only one string
+   Interning strings (ob_sstate) tries to ensure that only one string
    object with a given value exists, so equality tests can be one pointer
    comparison. This is generally restricted to strings that "look like"
    Python identifiers, although the intern() builtin can be used to force
@@ -35,10 +35,14 @@ functions should be applied to nil objects.
 typedef struct {
     PyObject_VAR_HEAD
     long ob_shash;
-    PyObject *ob_sinterned;
+    int ob_sstate;
     char ob_sval[1];
 } PyStringObject;

+#define SSTATE_NOT_INTERNED 0
+#define SSTATE_INTERNED_MORTAL 1
+#define SSTATE_INTERNED_IMMORTAL 2
+
 PyAPI_DATA(PyTypeObject) PyBaseString_Type;
 PyAPI_DATA(PyTypeObject) PyString_Type;

@@ -66,9 +70,13 @@ extern DL_IMPORT(PyObject *) PyString_DecodeEscape(const char *, int,
                                                    const char *);

 PyAPI_FUNC(void) PyString_InternInPlace(PyObject **);
+PyAPI_FUNC(void) PyString_InternImmortal(PyObject **);
 PyAPI_FUNC(PyObject *) PyString_InternFromString(const char *);
 PyAPI_FUNC(void) _Py_ReleaseInternedStrings(void);

+/* Use only if you know it's a string */
+#define PyString_CHECK_INTERNED(op) (((PyStringObject *)(op))->ob_sstate)
+
 /* Macro, trading safety for speed */
 #define PyString_AS_STRING(op) (((PyStringObject *)(op))->ob_sval)
 #define PyString_GET_SIZE(op) (((PyStringObject *)(op))->ob_size)
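
Reviewer note: the three SSTATE_* values and PyString_CHECK_INTERNED() can be read as a small state flag, where zero means "not interned" and either nonzero value means "interned". The sketch below illustrates that reading with a mock struct that compiles without Python.h. MockString, MOCK_CHECK_INTERNED, mock_is_interned and mock_is_immortal are hypothetical names made up for this note; they are not part of the Python C API, and the real PyStringObject also carries the PyObject_VAR_HEAD fields.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PyStringObject: only the fields discussed here. */
typedef struct {
    long ob_shash;
    int ob_sstate;
    char ob_sval[1];
} MockString;

/* Same values as the SSTATE_* constants added in this commit. */
#define SSTATE_NOT_INTERNED 0
#define SSTATE_INTERNED_MORTAL 1
#define SSTATE_INTERNED_IMMORTAL 2

/* Mock of PyString_CHECK_INTERNED(): yields the raw state, not a boolean. */
#define MOCK_CHECK_INTERNED(op) (((const MockString *)(op))->ob_sstate)

/* Returns 1 if the string is interned (mortal or immortal), else 0. */
int mock_is_interned(const MockString *s)
{
    assert(s != NULL);
    return MOCK_CHECK_INTERNED(s) != SSTATE_NOT_INTERNED;
}

/* Returns 1 only for immortal interned strings. */
int mock_is_immortal(const MockString *s)
{
    assert(s != NULL);
    return MOCK_CHECK_INTERNED(s) == SSTATE_INTERNED_IMMORTAL;
}
```

Because the macro yields the state itself, code that only needs a yes/no answer can test the result for truth, while code that cares about lifetime has to compare against SSTATE_INTERNED_IMMORTAL.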
17 changes: 17 additions & 0 deletions Misc/NEWS
@@ -57,6 +57,10 @@ Type/class unification and new-style classes

 Core and builtins

+- A subtle change to the semantics of the built-in function intern():
+  interned strings are no longer immortal. You must keep a reference
+  to the return value of intern() around to get the benefit.
+
 - Use of 'None' as a variable, argument or attribute name now
   issues a SyntaxWarning. In the future, None may become a keyword.

@@ -514,6 +518,19 @@ Build

 C API

+- The string object's layout has changed: the pointer member
+  ob_sinterned has been replaced by an int member ob_sstate. On some
+  platforms (e.g. most 64-bit systems) this may change the offset of
+  the ob_sval member, so as a precaution the API_VERSION has been
+  incremented. The apparently unused feature of "indirect interned
+  strings", supported by the ob_sinterned member, is gone. Interned
+  strings are now usually mortal; there's a new API,
+  PyString_InternImmortal(), that creates immortal interned strings.
+  (The ob_sstate member can only take three values; however, while
+  making it a char saves a few bytes per string object on average,
+  it also slowed things down a bit because ob_sval was no longer
+  aligned.)
+
 - The Py_InitModule*() functions now accept NULL for the 'methods'
   argument. Modules without global functions are becoming more common
   now that factories can be types rather than functions.
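
Reviewer note: the C API entry above says that replacing the pointer member with an int may move ob_sval on some platforms. A minimal way to see this is to compare offsetof() on two mock structs, as sketched below. OldLayout, NewLayout and MOCK_VAR_HEAD are hypothetical stand-ins written for this note; the real PyObject_VAR_HEAD fields have different types, and the actual offsets depend on the platform ABI, so no specific numbers are claimed here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical header standing in for PyObject_VAR_HEAD. */
#define MOCK_VAR_HEAD long ob_refcnt; void *ob_type; long ob_size;

/* Layout before this commit: a pointer-sized interning member. */
typedef struct {
    MOCK_VAR_HEAD
    long ob_shash;
    void *ob_sinterned;
    char ob_sval[1];
} OldLayout;

/* Layout after this commit: an int state member. */
typedef struct {
    MOCK_VAR_HEAD
    long ob_shash;
    int ob_sstate;
    char ob_sval[1];
} NewLayout;

size_t old_sval_offset(void) { return offsetof(OldLayout, ob_sval); }
size_t new_sval_offset(void) { return offsetof(NewLayout, ob_sval); }

/* Assumes int is no larger and no more strictly aligned than a pointer,
   which holds on the common ABIs but is not guaranteed by the C standard. */
void check_layouts(void)
{
    assert(new_sval_offset() <= old_sval_offset());
}
```

Where int and pointers have the same size the two offsets coincide; where pointers are wider than int, ob_sval moves closer to the start of the struct, which is the binary-compatibility concern behind the API version bump.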
44 changes: 24 additions & 20 deletions Objects/classobject.c
@@ -2300,37 +2300,38 @@ instancemethod_traverse(PyMethodObject *im, visitproc visit, void *arg)
     return 0;
 }

-static char *
-getclassname(PyObject *class)
+static void
+getclassname(PyObject *class, char *buf, int bufsize)
 {
     PyObject *name;

+    assert(bufsize > 1);
+    strcpy(buf, "?"); /* Default outcome */
     if (class == NULL)
-        name = NULL;
-    else
-        name = PyObject_GetAttrString(class, "__name__");
+        return;
+    name = PyObject_GetAttrString(class, "__name__");
     if (name == NULL) {
         /* This function cannot return an exception */
         PyErr_Clear();
-        return "?";
+        return;
     }
-    if (!PyString_Check(name)) {
-        Py_DECREF(name);
-        return "?";
+    if (PyString_Check(name)) {
+        strncpy(buf, PyString_AS_STRING(name), bufsize);
+        buf[bufsize-1] = '\0';
     }
-    PyString_InternInPlace(&name);
     Py_DECREF(name);
-    return PyString_AS_STRING(name);
 }

-static char *
-getinstclassname(PyObject *inst)
+static void
+getinstclassname(PyObject *inst, char *buf, int bufsize)
 {
     PyObject *class;
-    char *name;

-    if (inst == NULL)
-        return "nothing";
+    if (inst == NULL) {
+        assert(bufsize > strlen("nothing"));
+        strcpy(buf, "nothing");
+        return;
+    }

     class = PyObject_GetAttrString(inst, "__class__");
     if (class == NULL) {
@@ -2339,9 +2340,8 @@ getinstclassname(PyObject *inst)
         class = (PyObject *)(inst->ob_type);
         Py_INCREF(class);
     }
-    name = getclassname(class);
+    getclassname(class, buf, bufsize);
     Py_XDECREF(class);
-    return name;
 }

 static PyObject *
@@ -2366,14 +2366,18 @@ instancemethod_call(PyObject *func, PyObject *arg, PyObject *kw)
                 return NULL;
         }
         if (!ok) {
+            char clsbuf[256];
+            char instbuf[256];
+            getclassname(class, clsbuf, sizeof(clsbuf));
+            getinstclassname(self, instbuf, sizeof(instbuf));
             PyErr_Format(PyExc_TypeError,
                          "unbound method %s%s must be called with "
                          "%s instance as first argument "
                          "(got %s%s instead)",
                          PyEval_GetFuncName(func),
                          PyEval_GetFuncDesc(func),
-                         getclassname(class),
-                         getinstclassname(self),
+                         clsbuf,
+                         instbuf,
                          self == NULL ? "" : " instance");
             return NULL;
         }
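
Reviewer note: the getclassname() rewrite appears to follow directly from the mortal-interning change. The old version interned the name, dropped its reference, and still returned a pointer into the string, which was only safe while interned strings lived forever. The new version copies the text into a caller-supplied buffer instead. The sketch below isolates that copy pattern without Python.h; copy_name_or_default is a hypothetical helper written for this note, not a function from the patch, and it takes a plain C string where the real code reads from a Python string object.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper mirroring the copy pattern in the new getclassname():
   start from a default, then overwrite it with a bounded, terminated copy. */
void copy_name_or_default(const char *name, char *buf, size_t bufsize)
{
    assert(bufsize > 1);
    strcpy(buf, "?");              /* default outcome */
    if (name == NULL)
        return;
    strncpy(buf, name, bufsize);   /* leaves buf unterminated if name is too long */
    buf[bufsize - 1] = '\0';       /* so terminate explicitly */
}
```

The explicit terminator matters because strncpy() does not write a trailing NUL when the source fills the whole buffer; long names are silently truncated, which is acceptable for an error message.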
12 changes: 3 additions & 9 deletions Objects/dictobject.c
@@ -511,15 +511,9 @@ PyDict_SetItem(register PyObject *op, PyObject *key, PyObject *value)
     }
     mp = (dictobject *)op;
     if (PyString_CheckExact(key)) {
-        if (((PyStringObject *)key)->ob_sinterned != NULL) {
-            key = ((PyStringObject *)key)->ob_sinterned;
-            hash = ((PyStringObject *)key)->ob_shash;
-        }
-        else {
-            hash = ((PyStringObject *)key)->ob_shash;
-            if (hash == -1)
-                hash = PyObject_Hash(key);
-        }
+        hash = ((PyStringObject *)key)->ob_shash;
+        if (hash == -1)
+            hash = PyObject_Hash(key);
     }
     else {
         hash = PyObject_Hash(key);
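
Reviewer note: with ob_sinterned gone, PyDict_SetItem() no longer swaps the key for its interned twin; what remains is the cached-hash fast path, where -1 marks a hash that has not been computed yet. The sketch below shows that pattern in isolation. MockKey, mock_compute_hash, mock_key_hash and mock_hash_calls are hypothetical names for this note, and the hash function is an arbitrary placeholder, not CPython's string hash.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a string key with a cached hash. */
typedef struct {
    long ob_shash;       /* -1 means "not computed yet" */
    const char *text;
} MockKey;

int mock_hash_calls = 0; /* counts slow-path computations */

/* Slow path: compute a placeholder hash and cache it in the key. */
long mock_compute_hash(MockKey *key)
{
    unsigned long h = 5381;
    const char *p;

    assert(key != NULL && key->text != NULL);
    for (p = key->text; *p != '\0'; p++)
        h = h * 33 + (unsigned char)*p;
    mock_hash_calls++;
    key->ob_shash = (long)(h >> 1); /* top bit cleared, so never -1 */
    return key->ob_shash;
}

/* Fast path, shaped like the new code in PyDict_SetItem(). */
long mock_key_hash(MockKey *key)
{
    long hash = key->ob_shash;
    if (hash == -1)
        hash = mock_compute_hash(key);
    return hash;
}
```

Calling mock_key_hash() twice on the same key takes the slow path only once; the second call reads the cached value directly.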