[
  {
    "path": ".gitignore",
    "content": "*~\n*.pyc\n*.pyo\ngdb-heap-*.tar.bz2\ntest_*\n\n"
  },
  {
    "path": "ChangeLog.rst",
    "content": "==========\nChange Log\n==========\n\n* Since glibc v2.15, malloc can use multiple allocation arenas to better support\n  multi-threaded environments (http://stackoverflow.com/questions/10706466/how-does-malloc-work-in-a-multithreaded-environment).\n\n  A new command, \"heap arenas\", shows how many arenas exist and their respective addresses.\n\n"
  },
  {
    "path": "LICENSE-lgpl-2.1.txt",
    "content": "                  GNU LESSER GENERAL PUBLIC LICENSE\n                       Version 2.1, February 1999\n\n Copyright (C) 1991, 1999 Free Software Foundation, Inc.\n 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA\n Everyone is permitted to copy and distribute verbatim copies\n of this license document, but changing it is not allowed.\n\n[This is the first released version of the Lesser GPL.  It also counts\n as the successor of the GNU Library Public License, version 2, hence\n the version number 2.1.]\n\n                            Preamble\n\n  The licenses for most software are designed to take away your\nfreedom to share and change it.  By contrast, the GNU General Public\nLicenses are intended to guarantee your freedom to share and change\nfree software--to make sure the software is free for all its users.\n\n  This license, the Lesser General Public License, applies to some\nspecially designated software packages--typically libraries--of the\nFree Software Foundation and other authors who decide to use it.  You\ncan use it too, but we suggest you first think carefully about whether\nthis license or the ordinary General Public License is the better\nstrategy to use in any particular case, based on the explanations below.\n\n  When we speak of free software, we are referring to freedom of use,\nnot price.  Our General Public Licenses are designed to make sure that\nyou have the freedom to distribute copies of free software (and charge\nfor this service if you wish); that you receive source code or can get\nit if you want it; that you can change the software and use pieces of\nit in new free programs; and that you are informed that you can do\nthese things.\n\n  To protect your rights, we need to make restrictions that forbid\ndistributors to deny you these rights or to ask you to surrender these\nrights.  
These restrictions translate to certain responsibilities for\nyou if you distribute copies of the library or if you modify it.\n\n  For example, if you distribute copies of the library, whether gratis\nor for a fee, you must give the recipients all the rights that we gave\nyou.  You must make sure that they, too, receive or can get the source\ncode.  If you link other code with the library, you must provide\ncomplete object files to the recipients, so that they can relink them\nwith the library after making changes to the library and recompiling\nit.  And you must show them these terms so they know their rights.\n\n  We protect your rights with a two-step method: (1) we copyright the\nlibrary, and (2) we offer you this license, which gives you legal\npermission to copy, distribute and/or modify the library.\n\n  To protect each distributor, we want to make it very clear that\nthere is no warranty for the free library.  Also, if the library is\nmodified by someone else and passed on, the recipients should know\nthat what they have is not the original version, so that the original\nauthor's reputation will not be affected by problems that might be\nintroduced by others.\n\f\n  Finally, software patents pose a constant threat to the existence of\nany free program.  We wish to make sure that a company cannot\neffectively restrict the users of a free program by obtaining a\nrestrictive license from a patent holder.  Therefore, we insist that\nany patent license obtained for a version of the library must be\nconsistent with the full freedom of use specified in this license.\n\n  Most GNU software, including some libraries, is covered by the\nordinary GNU General Public License.  This license, the GNU Lesser\nGeneral Public License, applies to certain designated libraries, and\nis quite different from the ordinary General Public License.  
We use\nthis license for certain libraries in order to permit linking those\nlibraries into non-free programs.\n\n  When a program is linked with a library, whether statically or using\na shared library, the combination of the two is legally speaking a\ncombined work, a derivative of the original library.  The ordinary\nGeneral Public License therefore permits such linking only if the\nentire combination fits its criteria of freedom.  The Lesser General\nPublic License permits more lax criteria for linking other code with\nthe library.\n\n  We call this license the \"Lesser\" General Public License because it\ndoes Less to protect the user's freedom than the ordinary General\nPublic License.  It also provides other free software developers Less\nof an advantage over competing non-free programs.  These disadvantages\nare the reason we use the ordinary General Public License for many\nlibraries.  However, the Lesser license provides advantages in certain\nspecial circumstances.\n\n  For example, on rare occasions, there may be a special need to\nencourage the widest possible use of a certain library, so that it becomes\na de-facto standard.  To achieve this, non-free programs must be\nallowed to use the library.  A more frequent case is that a free\nlibrary does the same job as widely used non-free libraries.  In this\ncase, there is little to gain by limiting the free library to free\nsoftware only, so we use the Lesser General Public License.\n\n  In other cases, permission to use a particular library in non-free\nprograms enables a greater number of people to use a large body of\nfree software.  
For example, permission to use the GNU C Library in\nnon-free programs enables many more people to use the whole GNU\noperating system, as well as its variant, the GNU/Linux operating\nsystem.\n\n  Although the Lesser General Public License is Less protective of the\nusers' freedom, it does ensure that the user of a program that is\nlinked with the Library has the freedom and the wherewithal to run\nthat program using a modified version of the Library.\n\n  The precise terms and conditions for copying, distribution and\nmodification follow.  Pay close attention to the difference between a\n\"work based on the library\" and a \"work that uses the library\".  The\nformer contains code derived from the library, whereas the latter must\nbe combined with the library in order to run.\n\f\n                  GNU LESSER GENERAL PUBLIC LICENSE\n   TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION\n\n  0. This License Agreement applies to any software library or other\nprogram which contains a notice placed by the copyright holder or\nother authorized party saying it may be distributed under the terms of\nthis Lesser General Public License (also called \"this License\").\nEach licensee is addressed as \"you\".\n\n  A \"library\" means a collection of software functions and/or data\nprepared so as to be conveniently linked with application programs\n(which use some of those functions and data) to form executables.\n\n  The \"Library\", below, refers to any such software library or work\nwhich has been distributed under these terms.  A \"work based on the\nLibrary\" means either the Library or any derivative work under\ncopyright law: that is to say, a work containing the Library or a\nportion of it, either verbatim or with modifications and/or translated\nstraightforwardly into another language.  
(Hereinafter, translation is\nincluded without limitation in the term \"modification\".)\n\n  \"Source code\" for a work means the preferred form of the work for\nmaking modifications to it.  For a library, complete source code means\nall the source code for all modules it contains, plus any associated\ninterface definition files, plus the scripts used to control compilation\nand installation of the library.\n\n  Activities other than copying, distribution and modification are not\ncovered by this License; they are outside its scope.  The act of\nrunning a program using the Library is not restricted, and output from\nsuch a program is covered only if its contents constitute a work based\non the Library (independent of the use of the Library in a tool for\nwriting it).  Whether that is true depends on what the Library does\nand what the program that uses the Library does.\n\n  1. You may copy and distribute verbatim copies of the Library's\ncomplete source code as you receive it, in any medium, provided that\nyou conspicuously and appropriately publish on each copy an\nappropriate copyright notice and disclaimer of warranty; keep intact\nall the notices that refer to this License and to the absence of any\nwarranty; and distribute a copy of this License along with the\nLibrary.\n\n  You may charge a fee for the physical act of transferring a copy,\nand you may at your option offer warranty protection in exchange for a\nfee.\n\f\n  2. 
You may modify your copy or copies of the Library or any portion\nof it, thus forming a work based on the Library, and copy and\ndistribute such modifications or work under the terms of Section 1\nabove, provided that you also meet all of these conditions:\n\n    a) The modified work must itself be a software library.\n\n    b) You must cause the files modified to carry prominent notices\n    stating that you changed the files and the date of any change.\n\n    c) You must cause the whole of the work to be licensed at no\n    charge to all third parties under the terms of this License.\n\n    d) If a facility in the modified Library refers to a function or a\n    table of data to be supplied by an application program that uses\n    the facility, other than as an argument passed when the facility\n    is invoked, then you must make a good faith effort to ensure that,\n    in the event an application does not supply such function or\n    table, the facility still operates, and performs whatever part of\n    its purpose remains meaningful.\n\n    (For example, a function in a library to compute square roots has\n    a purpose that is entirely well-defined independent of the\n    application.  Therefore, Subsection 2d requires that any\n    application-supplied function or table used by this function must\n    be optional: if the application does not supply it, the square\n    root function must still compute square roots.)\n\nThese requirements apply to the modified work as a whole.  If\nidentifiable sections of that work are not derived from the Library,\nand can be reasonably considered independent and separate works in\nthemselves, then this License, and its terms, do not apply to those\nsections when you distribute them as separate works.  
But when you\ndistribute the same sections as part of a whole which is a work based\non the Library, the distribution of the whole must be on the terms of\nthis License, whose permissions for other licensees extend to the\nentire whole, and thus to each and every part regardless of who wrote\nit.\n\nThus, it is not the intent of this section to claim rights or contest\nyour rights to work written entirely by you; rather, the intent is to\nexercise the right to control the distribution of derivative or\ncollective works based on the Library.\n\nIn addition, mere aggregation of another work not based on the Library\nwith the Library (or with a work based on the Library) on a volume of\na storage or distribution medium does not bring the other work under\nthe scope of this License.\n\n  3. You may opt to apply the terms of the ordinary GNU General Public\nLicense instead of this License to a given copy of the Library.  To do\nthis, you must alter all the notices that refer to this License, so\nthat they refer to the ordinary GNU General Public License, version 2,\ninstead of to this License.  (If a newer version than version 2 of the\nordinary GNU General Public License has appeared, then you can specify\nthat version instead if you wish.)  Do not make any other change in\nthese notices.\n\f\n  Once this change is made in a given copy, it is irreversible for\nthat copy, so the ordinary GNU General Public License applies to all\nsubsequent copies and derivative works made from that copy.\n\n  This option is useful when you wish to copy part of the code of\nthe Library into a program that is not a library.\n\n  4. 
You may copy and distribute the Library (or a portion or\nderivative of it, under Section 2) in object code or executable form\nunder the terms of Sections 1 and 2 above provided that you accompany\nit with the complete corresponding machine-readable source code, which\nmust be distributed under the terms of Sections 1 and 2 above on a\nmedium customarily used for software interchange.\n\n  If distribution of object code is made by offering access to copy\nfrom a designated place, then offering equivalent access to copy the\nsource code from the same place satisfies the requirement to\ndistribute the source code, even though third parties are not\ncompelled to copy the source along with the object code.\n\n  5. A program that contains no derivative of any portion of the\nLibrary, but is designed to work with the Library by being compiled or\nlinked with it, is called a \"work that uses the Library\".  Such a\nwork, in isolation, is not a derivative work of the Library, and\ntherefore falls outside the scope of this License.\n\n  However, linking a \"work that uses the Library\" with the Library\ncreates an executable that is a derivative of the Library (because it\ncontains portions of the Library), rather than a \"work that uses the\nlibrary\".  The executable is therefore covered by this License.\nSection 6 states terms for distribution of such executables.\n\n  When a \"work that uses the Library\" uses material from a header file\nthat is part of the Library, the object code for the work may be a\nderivative work of the Library even though the source code is not.\nWhether this is true is especially significant if the work can be\nlinked without the Library, or if the work is itself a library.  
The\nthreshold for this to be true is not precisely defined by law.\n\n  If such an object file uses only numerical parameters, data\nstructure layouts and accessors, and small macros and small inline\nfunctions (ten lines or less in length), then the use of the object\nfile is unrestricted, regardless of whether it is legally a derivative\nwork.  (Executables containing this object code plus portions of the\nLibrary will still fall under Section 6.)\n\n  Otherwise, if the work is a derivative of the Library, you may\ndistribute the object code for the work under the terms of Section 6.\nAny executables containing that work also fall under Section 6,\nwhether or not they are linked directly with the Library itself.\n\f\n  6. As an exception to the Sections above, you may also combine or\nlink a \"work that uses the Library\" with the Library to produce a\nwork containing portions of the Library, and distribute that work\nunder terms of your choice, provided that the terms permit\nmodification of the work for the customer's own use and reverse\nengineering for debugging such modifications.\n\n  You must give prominent notice with each copy of the work that the\nLibrary is used in it and that the Library and its use are covered by\nthis License.  You must supply a copy of this License.  If the work\nduring execution displays copyright notices, you must include the\ncopyright notice for the Library among them, as well as a reference\ndirecting the user to the copy of this License.  
Also, you must do one\nof these things:\n\n    a) Accompany the work with the complete corresponding\n    machine-readable source code for the Library including whatever\n    changes were used in the work (which must be distributed under\n    Sections 1 and 2 above); and, if the work is an executable linked\n    with the Library, with the complete machine-readable \"work that\n    uses the Library\", as object code and/or source code, so that the\n    user can modify the Library and then relink to produce a modified\n    executable containing the modified Library.  (It is understood\n    that the user who changes the contents of definitions files in the\n    Library will not necessarily be able to recompile the application\n    to use the modified definitions.)\n\n    b) Use a suitable shared library mechanism for linking with the\n    Library.  A suitable mechanism is one that (1) uses at run time a\n    copy of the library already present on the user's computer system,\n    rather than copying library functions into the executable, and (2)\n    will operate properly with a modified version of the library, if\n    the user installs one, as long as the modified version is\n    interface-compatible with the version that the work was made with.\n\n    c) Accompany the work with a written offer, valid for at\n    least three years, to give the same user the materials\n    specified in Subsection 6a, above, for a charge no more\n    than the cost of performing this distribution.\n\n    d) If distribution of the work is made by offering access to copy\n    from a designated place, offer equivalent access to copy the above\n    specified materials from the same place.\n\n    e) Verify that the user has already received a copy of these\n    materials or that you have already sent this user a copy.\n\n  For an executable, the required form of the \"work that uses the\nLibrary\" must include any data and utility programs needed for\nreproducing the executable from it.  
However, as a special exception,\nthe materials to be distributed need not include anything that is\nnormally distributed (in either source or binary form) with the major\ncomponents (compiler, kernel, and so on) of the operating system on\nwhich the executable runs, unless that component itself accompanies\nthe executable.\n\n  It may happen that this requirement contradicts the license\nrestrictions of other proprietary libraries that do not normally\naccompany the operating system.  Such a contradiction means you cannot\nuse both them and the Library together in an executable that you\ndistribute.\n\f\n  7. You may place library facilities that are a work based on the\nLibrary side-by-side in a single library together with other library\nfacilities not covered by this License, and distribute such a combined\nlibrary, provided that the separate distribution of the work based on\nthe Library and of the other library facilities is otherwise\npermitted, and provided that you do these two things:\n\n    a) Accompany the combined library with a copy of the same work\n    based on the Library, uncombined with any other library\n    facilities.  This must be distributed under the terms of the\n    Sections above.\n\n    b) Give prominent notice with the combined library of the fact\n    that part of it is a work based on the Library, and explaining\n    where to find the accompanying uncombined form of the same work.\n\n  8. You may not copy, modify, sublicense, link with, or distribute\nthe Library except as expressly provided under this License.  Any\nattempt otherwise to copy, modify, sublicense, link with, or\ndistribute the Library is void, and will automatically terminate your\nrights under this License.  However, parties who have received copies,\nor rights, from you under this License will not have their licenses\nterminated so long as such parties remain in full compliance.\n\n  9. You are not required to accept this License, since you have not\nsigned it.  
However, nothing else grants you permission to modify or\ndistribute the Library or its derivative works.  These actions are\nprohibited by law if you do not accept this License.  Therefore, by\nmodifying or distributing the Library (or any work based on the\nLibrary), you indicate your acceptance of this License to do so, and\nall its terms and conditions for copying, distributing or modifying\nthe Library or works based on it.\n\n  10. Each time you redistribute the Library (or any work based on the\nLibrary), the recipient automatically receives a license from the\noriginal licensor to copy, distribute, link with or modify the Library\nsubject to these terms and conditions.  You may not impose any further\nrestrictions on the recipients' exercise of the rights granted herein.\nYou are not responsible for enforcing compliance by third parties with\nthis License.\n\f\n  11. If, as a consequence of a court judgment or allegation of patent\ninfringement or for any other reason (not limited to patent issues),\nconditions are imposed on you (whether by court order, agreement or\notherwise) that contradict the conditions of this License, they do not\nexcuse you from the conditions of this License.  If you cannot\ndistribute so as to satisfy simultaneously your obligations under this\nLicense and any other pertinent obligations, then as a consequence you\nmay not distribute the Library at all.  
For example, if a patent\nlicense would not permit royalty-free redistribution of the Library by\nall those who receive copies directly or indirectly through you, then\nthe only way you could satisfy both it and this License would be to\nrefrain entirely from distribution of the Library.\n\nIf any portion of this section is held invalid or unenforceable under any\nparticular circumstance, the balance of the section is intended to apply,\nand the section as a whole is intended to apply in other circumstances.\n\nIt is not the purpose of this section to induce you to infringe any\npatents or other property right claims or to contest validity of any\nsuch claims; this section has the sole purpose of protecting the\nintegrity of the free software distribution system which is\nimplemented by public license practices.  Many people have made\ngenerous contributions to the wide range of software distributed\nthrough that system in reliance on consistent application of that\nsystem; it is up to the author/donor to decide if he or she is willing\nto distribute software through any other system and a licensee cannot\nimpose that choice.\n\nThis section is intended to make thoroughly clear what is believed to\nbe a consequence of the rest of this License.\n\n  12. If the distribution and/or use of the Library is restricted in\ncertain countries either by patents or by copyrighted interfaces, the\noriginal copyright holder who places the Library under this License may add\nan explicit geographical distribution limitation excluding those countries,\nso that distribution is permitted only in or among countries not thus\nexcluded.  In such case, this License incorporates the limitation as if\nwritten in the body of this License.\n\n  13. 
The Free Software Foundation may publish revised and/or new\nversions of the Lesser General Public License from time to time.\nSuch new versions will be similar in spirit to the present version,\nbut may differ in detail to address new problems or concerns.\n\nEach version is given a distinguishing version number.  If the Library\nspecifies a version number of this License which applies to it and\n\"any later version\", you have the option of following the terms and\nconditions either of that version or of any later version published by\nthe Free Software Foundation.  If the Library does not specify a\nlicense version number, you may choose any version ever published by\nthe Free Software Foundation.\n\f\n  14. If you wish to incorporate parts of the Library into other free\nprograms whose distribution conditions are incompatible with these,\nwrite to the author to ask for permission.  For software which is\ncopyrighted by the Free Software Foundation, write to the Free\nSoftware Foundation; we sometimes make exceptions for this.  Our\ndecision will be guided by the two goals of preserving the free status\nof all derivatives of our free software and of promoting the sharing\nand reuse of software generally.\n\n                            NO WARRANTY\n\n  15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO\nWARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW.\nEXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR\nOTHER PARTIES PROVIDE THE LIBRARY \"AS IS\" WITHOUT WARRANTY OF ANY\nKIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE\nIMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR\nPURPOSE.  THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE\nLIBRARY IS WITH YOU.  SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME\nTHE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.\n\n  16. 
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN\nWRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY\nAND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU\nFOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR\nCONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE\nLIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING\nRENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A\nFAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF\nSUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH\nDAMAGES.\n\n                     END OF TERMS AND CONDITIONS\n\f\n           How to Apply These Terms to Your New Libraries\n\n  If you develop a new library, and you want it to be of the greatest\npossible use to the public, we recommend making it free software that\neveryone can redistribute and change.  You can do so by permitting\nredistribution under these terms (or, alternatively, under the terms of the\nordinary General Public License).\n\n  To apply these terms, attach the following notices to the library.  It is\nsafest to attach them to the start of each source file to most effectively\nconvey the exclusion of warranty; and each file should have at least the\n\"copyright\" line and a pointer to where the full notice is found.\n\n    <one line to give the library's name and a brief idea of what it does.>\n    Copyright (C) <year>  <name of author>\n\n    This library is free software; you can redistribute it and/or\n    modify it under the terms of the GNU Lesser General Public\n    License as published by the Free Software Foundation; either\n    version 2.1 of the License, or (at your option) any later version.\n\n    This library is distributed in the hope that it will be useful,\n    but WITHOUT ANY WARRANTY; without even the implied warranty of\n    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  
See the GNU\n    Lesser General Public License for more details.\n\n    You should have received a copy of the GNU Lesser General Public\n    License along with this library; if not, write to the Free Software\n    Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA\n\nAlso add information on how to contact you by electronic and paper mail.\n\nYou should also get your employer (if you work as a programmer) or your\nschool, if any, to sign a \"copyright disclaimer\" for the library, if\nnecessary.  Here is a sample; alter the names:\n\n  Yoyodyne, Inc., hereby disclaims all copyright interest in the\n  library `Frob' (a library for tweaking knobs) written by James Random Hacker.\n\n  <signature of Ty Coon>, 1 April 1990\n  Ty Coon, President of Vice\n\nThat's all there is to it!\n"
  },
  {
    "path": "LICENSE-python.txt",
    "content": "A. HISTORY OF THE SOFTWARE\n==========================\n\nPython was created in the early 1990s by Guido van Rossum at Stichting\nMathematisch Centrum (CWI, see http://www.cwi.nl) in the Netherlands\nas a successor of a language called ABC.  Guido remains Python's\nprincipal author, although it includes many contributions from others.\n\nIn 1995, Guido continued his work on Python at the Corporation for\nNational Research Initiatives (CNRI, see http://www.cnri.reston.va.us)\nin Reston, Virginia where he released several versions of the\nsoftware.\n\nIn May 2000, Guido and the Python core development team moved to\nBeOpen.com to form the BeOpen PythonLabs team.  In October of the same\nyear, the PythonLabs team moved to Digital Creations (now Zope\nCorporation, see http://www.zope.com).  In 2001, the Python Software\nFoundation (PSF, see http://www.python.org/psf/) was formed, a\nnon-profit organization created specifically to own Python-related\nIntellectual Property.  Zope Corporation is a sponsoring member of\nthe PSF.\n\nAll Python releases are Open Source (see http://www.opensource.org for\nthe Open Source Definition).  Historically, most, but not all, Python\nreleases have also been GPL-compatible; the table below summarizes\nthe various releases.\n\n    Release         Derived     Year        Owner       GPL-\n                    from                                compatible? 
(1)\n\n    0.9.0 thru 1.2              1991-1995   CWI         yes\n    1.3 thru 1.5.2  1.2         1995-1999   CNRI        yes\n    1.6             1.5.2       2000        CNRI        no\n    2.0             1.6         2000        BeOpen.com  no\n    1.6.1           1.6         2001        CNRI        yes (2)\n    2.1             2.0+1.6.1   2001        PSF         no\n    2.0.1           2.0+1.6.1   2001        PSF         yes\n    2.1.1           2.1+2.0.1   2001        PSF         yes\n    2.2             2.1.1       2001        PSF         yes\n    2.1.2           2.1.1       2002        PSF         yes\n    2.1.3           2.1.2       2002        PSF         yes\n    2.2.1           2.2         2002        PSF         yes\n    2.2.2           2.2.1       2002        PSF         yes\n    2.2.3           2.2.2       2003        PSF         yes\n    2.3             2.2.2       2002-2003   PSF         yes\n    2.3.1           2.3         2002-2003   PSF         yes\n    2.3.2           2.3.1       2002-2003   PSF         yes\n    2.3.3           2.3.2       2002-2003   PSF         yes\n    2.3.4           2.3.3       2004        PSF         yes\n    2.3.5           2.3.4       2005        PSF         yes\n    2.4             2.3         2004        PSF         yes\n    2.4.1           2.4         2005        PSF         yes\n    2.4.2           2.4.1       2005        PSF         yes\n    2.4.3           2.4.2       2006        PSF         yes\n    2.4.4           2.4.3       2006        PSF         yes\n    2.5             2.4         2006        PSF         yes\n    2.5.1           2.5         2007        PSF         yes\n    2.5.2           2.5.1       2008        PSF         yes\n    2.5.3           2.5.2       2008        PSF         yes\n    2.6             2.5         2008        PSF         yes\n    2.6.1           2.6         2008        PSF         yes\n    2.6.2           2.6.1       2009        PSF         yes\n    2.6.3           2.6.2       2009    
    PSF         yes\n    2.6.4           2.6.3       2009        PSF         yes\n    2.6.5           2.6.4       2010        PSF         yes\n\nFootnotes:\n\n(1) GPL-compatible doesn't mean that we're distributing Python under\n    the GPL.  All Python licenses, unlike the GPL, let you distribute\n    a modified version without making your changes open source.  The\n    GPL-compatible licenses make it possible to combine Python with\n    other software that is released under the GPL; the others don't.\n\n(2) According to Richard Stallman, 1.6.1 is not GPL-compatible,\n    because its license has a choice of law clause.  According to\n    CNRI, however, Stallman's lawyer has told CNRI's lawyer that 1.6.1\n    is \"not incompatible\" with the GPL.\n\nThanks to the many outside volunteers who have worked under Guido's\ndirection to make these releases possible.\n\n\nB. TERMS AND CONDITIONS FOR ACCESSING OR OTHERWISE USING PYTHON\n===============================================================\n\nPYTHON SOFTWARE FOUNDATION LICENSE VERSION 2\n--------------------------------------------\n\n1. This LICENSE AGREEMENT is between the Python Software Foundation\n(\"PSF\"), and the Individual or Organization (\"Licensee\") accessing and\notherwise using this software (\"Python\") in source or binary form and\nits associated documentation.\n\n2. Subject to the terms and conditions of this License Agreement, PSF hereby\ngrants Licensee a nonexclusive, royalty-free, world-wide license to reproduce,\nanalyze, test, perform and/or display publicly, prepare derivative works,\ndistribute, and otherwise use Python alone or in any derivative version,\nprovided, however, that PSF's License Agreement and PSF's notice of copyright,\ni.e., \"Copyright (c) 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010\nPython Software Foundation; All Rights Reserved\" are retained in Python alone or\nin any derivative version prepared by Licensee.\n\n3. 
In the event Licensee prepares a derivative work that is based on\nor incorporates Python or any part thereof, and wants to make\nthe derivative work available to others as provided herein, then\nLicensee hereby agrees to include in any such work a brief summary of\nthe changes made to Python.\n\n4. PSF is making Python available to Licensee on an \"AS IS\"\nbasis.  PSF MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR\nIMPLIED.  BY WAY OF EXAMPLE, BUT NOT LIMITATION, PSF MAKES NO AND\nDISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS\nFOR ANY PARTICULAR PURPOSE OR THAT THE USE OF PYTHON WILL NOT\nINFRINGE ANY THIRD PARTY RIGHTS.\n\n5. PSF SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF PYTHON\nFOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS\nA RESULT OF MODIFYING, DISTRIBUTING, OR OTHERWISE USING PYTHON,\nOR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF.\n\n6. This License Agreement will automatically terminate upon a material\nbreach of its terms and conditions.\n\n7. Nothing in this License Agreement shall be deemed to create any\nrelationship of agency, partnership, or joint venture between PSF and\nLicensee.  This License Agreement does not grant permission to use PSF\ntrademarks or trade name in a trademark sense to endorse or promote\nproducts or services of Licensee, or any third party.\n\n8. By copying, installing or otherwise using Python, Licensee\nagrees to be bound by the terms and conditions of this License\nAgreement.\n\n\nBEOPEN.COM LICENSE AGREEMENT FOR PYTHON 2.0\n-------------------------------------------\n\nBEOPEN PYTHON OPEN SOURCE LICENSE AGREEMENT VERSION 1\n\n1. This LICENSE AGREEMENT is between BeOpen.com (\"BeOpen\"), having an\noffice at 160 Saratoga Avenue, Santa Clara, CA 95051, and the\nIndividual or Organization (\"Licensee\") accessing and otherwise using\nthis software in source or binary form and its associated\ndocumentation (\"the Software\").\n\n2. 
Subject to the terms and conditions of this BeOpen Python License\nAgreement, BeOpen hereby grants Licensee a non-exclusive,\nroyalty-free, world-wide license to reproduce, analyze, test, perform\nand/or display publicly, prepare derivative works, distribute, and\notherwise use the Software alone or in any derivative version,\nprovided, however, that the BeOpen Python License is retained in the\nSoftware, alone or in any derivative version prepared by Licensee.\n\n3. BeOpen is making the Software available to Licensee on an \"AS IS\"\nbasis.  BEOPEN MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR\nIMPLIED.  BY WAY OF EXAMPLE, BUT NOT LIMITATION, BEOPEN MAKES NO AND\nDISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS\nFOR ANY PARTICULAR PURPOSE OR THAT THE USE OF THE SOFTWARE WILL NOT\nINFRINGE ANY THIRD PARTY RIGHTS.\n\n4. BEOPEN SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF THE\nSOFTWARE FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS\nAS A RESULT OF USING, MODIFYING OR DISTRIBUTING THE SOFTWARE, OR ANY\nDERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF.\n\n5. This License Agreement will automatically terminate upon a material\nbreach of its terms and conditions.\n\n6. This License Agreement shall be governed by and interpreted in all\nrespects by the law of the State of California, excluding conflict of\nlaw provisions.  Nothing in this License Agreement shall be deemed to\ncreate any relationship of agency, partnership, or joint venture\nbetween BeOpen and Licensee.  This License Agreement does not grant\npermission to use BeOpen trademarks or trade names in a trademark\nsense to endorse or promote products or services of Licensee, or any\nthird party.  As an exception, the \"BeOpen Python\" logos available at\nhttp://www.pythonlabs.com/logos.html may be used according to the\npermissions granted on that web page.\n\n7. 
By copying, installing or otherwise using the software, Licensee\nagrees to be bound by the terms and conditions of this License\nAgreement.\n\n\nCNRI LICENSE AGREEMENT FOR PYTHON 1.6.1\n---------------------------------------\n\n1. This LICENSE AGREEMENT is between the Corporation for National\nResearch Initiatives, having an office at 1895 Preston White Drive,\nReston, VA 20191 (\"CNRI\"), and the Individual or Organization\n(\"Licensee\") accessing and otherwise using Python 1.6.1 software in\nsource or binary form and its associated documentation.\n\n2. Subject to the terms and conditions of this License Agreement, CNRI\nhereby grants Licensee a nonexclusive, royalty-free, world-wide\nlicense to reproduce, analyze, test, perform and/or display publicly,\nprepare derivative works, distribute, and otherwise use Python 1.6.1\nalone or in any derivative version, provided, however, that CNRI's\nLicense Agreement and CNRI's notice of copyright, i.e., \"Copyright (c)\n1995-2001 Corporation for National Research Initiatives; All Rights\nReserved\" are retained in Python 1.6.1 alone or in any derivative\nversion prepared by Licensee.  Alternately, in lieu of CNRI's License\nAgreement, Licensee may substitute the following text (omitting the\nquotes): \"Python 1.6.1 is made available subject to the terms and\nconditions in CNRI's License Agreement.  This Agreement together with\nPython 1.6.1 may be located on the Internet using the following\nunique, persistent identifier (known as a handle): 1895.22/1013.  This\nAgreement may also be obtained from a proxy server on the Internet\nusing the following URL: http://hdl.handle.net/1895.22/1013\".\n\n3. In the event Licensee prepares a derivative work that is based on\nor incorporates Python 1.6.1 or any part thereof, and wants to make\nthe derivative work available to others as provided herein, then\nLicensee hereby agrees to include in any such work a brief summary of\nthe changes made to Python 1.6.1.\n\n4. 
CNRI is making Python 1.6.1 available to Licensee on an \"AS IS\"\nbasis.  CNRI MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR\nIMPLIED.  BY WAY OF EXAMPLE, BUT NOT LIMITATION, CNRI MAKES NO AND\nDISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS\nFOR ANY PARTICULAR PURPOSE OR THAT THE USE OF PYTHON 1.6.1 WILL NOT\nINFRINGE ANY THIRD PARTY RIGHTS.\n\n5. CNRI SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF PYTHON\n1.6.1 FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS\nA RESULT OF MODIFYING, DISTRIBUTING, OR OTHERWISE USING PYTHON 1.6.1,\nOR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF.\n\n6. This License Agreement will automatically terminate upon a material\nbreach of its terms and conditions.\n\n7. This License Agreement shall be governed by the federal\nintellectual property law of the United States, including without\nlimitation the federal copyright law, and, to the extent such\nU.S. federal law does not apply, by the law of the Commonwealth of\nVirginia, excluding Virginia's conflict of law provisions.\nNotwithstanding the foregoing, with regard to derivative works based\non Python 1.6.1 that incorporate non-separable material that was\npreviously distributed under the GNU General Public License (GPL), the\nlaw of the Commonwealth of Virginia shall govern this License\nAgreement only as to issues arising under or with respect to\nParagraphs 4, 5, and 7 of this License Agreement.  Nothing in this\nLicense Agreement shall be deemed to create any relationship of\nagency, partnership, or joint venture between CNRI and Licensee.  This\nLicense Agreement does not grant permission to use CNRI trademarks or\ntrade name in a trademark sense to endorse or promote products or\nservices of Licensee, or any third party.\n\n8. 
By clicking on the \"ACCEPT\" button where indicated, or by copying,\ninstalling or otherwise using Python 1.6.1, Licensee agrees to be\nbound by the terms and conditions of this License Agreement.\n\n        ACCEPT\n\n\nCWI LICENSE AGREEMENT FOR PYTHON 0.9.0 THROUGH 1.2\n--------------------------------------------------\n\nCopyright (c) 1991 - 1995, Stichting Mathematisch Centrum Amsterdam,\nThe Netherlands.  All rights reserved.\n\nPermission to use, copy, modify, and distribute this software and its\ndocumentation for any purpose and without fee is hereby granted,\nprovided that the above copyright notice appear in all copies and that\nboth that copyright notice and this permission notice appear in\nsupporting documentation, and that the name of Stichting Mathematisch\nCentrum or CWI not be used in advertising or publicity pertaining to\ndistribution of the software without specific, written prior\npermission.\n\nSTICHTING MATHEMATISCH CENTRUM DISCLAIMS ALL WARRANTIES WITH REGARD TO\nTHIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND\nFITNESS, IN NO EVENT SHALL STICHTING MATHEMATISCH CENTRUM BE LIABLE\nFOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES\nWHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN\nACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT\nOF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.\n"
  },
  {
    "path": "LICENSE.txt",
    "content": "gdb-heap is licensed under the LGPLv2.1, with the exception of heap/python.py,\nwhich is licensed under the PSF license.\n"
  },
  {
    "path": "README.md",
    "content": "gdb-heap\n========\n\nOriginal fork derived from `https://fedorahosted.org/gdb-heap/`.  This repo is now considered the official repository for the gdb-heap library.\n\nInstallation instructions\n-------------------------\n1. To get this module working with Ubuntu 16.04, make sure you have the following packages installed:\n\n```\nsudo apt-get install libc6-dev libc6-dbg python-gi libglib2.0-0-dbg python-ply\n```\n\nThe original forked version assumes an \"import gdb\" module, which resides in\n\"/usr/share/glib-2.0/gdb\" as part of the `libglib2.0-0-dbg` package.  Earlier versions\nof Ubuntu have this library is located in the `ibglib2.0-dev` package.\n\nThere is also a conflict with the python-gobject-2 library, which are deprecated\nPython bindings for the GObject library.  This package installs a glib/\ndirectory inside /usr/lib/python2.7/dist-packages/glib/option.py, which many\nGtk-related modules depend.  You will therefore need to make sure the sys.path\nfor /usr/share/glib-2.0/gdb is declared first for this reason (see code\nexample).\n\nYou'll also want to install python-dbg since the package comes with the\ndebugging symbols for the stock Python 2.7, as well as a python-dbg binary\ncompiled with the --with-pydebug option that will only work with C extensions\nmodules compiled with the /usr/include/python2.7_d headers.\n\nNOTE: The Python binary that accompanies Ubuntu distributions uses link-time\noptimization compilation.  As a result, many of the Python data structures are\noptimized out and prevent gdb-heap from being able to properly categorize the\nvarious data structures.  To take advantage of this capability, you will need to\ndownload the Python source and recompile without using the -flto option in\nthe CFLAGS/LDFLAGS configuration option.  Normally this capability is not used in\nstandard configure so simply compiling it should do the trick.  
(If you want\nto have SSL support in this binary, make sure to edit Modules/Setup.dist).\n\nThe python-dbg binary is compiled with the Py_TRACE_REFS conditional via the\n--with-pydebug option, which modifies the internal Python data structures and adds two\npointers to every base PyObject, preventing previously compiled C extension\nmodules from being used.  Using your own compiled version of Python is therefore the way to\ngo if you want to take advantage of the categorization features of gdb-heap and/or\ninspect the internal memory structures of Python.\n\n2. Create a file that will help automate the loading of the gdbheap library:\n\ngdb-heap-commands:\n\n```\npython\nimport sys\nsys.path.insert(0, \"/usr/share/glib-2.0/gdb\")\nsys.path.append(\"/home/rhu/projects/gdb-heap\")\nimport gdbheap\nend\n```\n\nTo attach to an existing process, you can execute as follows:\n\n```bash\nsudo gdb -p 7458 -x ~/gdb-heap-commands\n```\n\nTo take a core dump of a process, you can do the following:\n\n```\n1) sudo gdb -p <pid>\n2) Type \"generate-core-file\" at the GDB prompt.\n3) Wait awhile (and be careful not to hit Enter again, since it will repeat the same command).\n4) Copy the core.<pid> file somewhere.\n```\n\nYou can then use gdb to attach to this core file:\n\n```bash\nsudo gdb python <core file> -x ~/gdb-heap-commands\n```\n\n\nCommands to run\n---------------\n\n```\nheap - print a report on memory usage, by category\nheap sizes - print a report on memory usage, by sizes\nheap used - print used heap chunks\nheap free - print free heap chunks\nheap all - print all heap chunks\nheap log - print a log of recorded heap states\nheap label - record the current state of the heap for later comparison\nheap diff - compare two states of the heap\nheap select - query used heap chunks\nhexdump <addr> [-c] - print a hexdump, starting at the specified region of memory (expose hex characters with the -c option)\nheap arenas - print glibc arenas\nheap arena <arena> - 
select a glibc arena by number\n```\n\nUseful resources\n----------------\n\n * http://blip.tv/pycon-us-videos-2009-2010-2011/pycon-2011-dude-where-s-my-ram-a-deep-dive-into-how-python-uses-memory-4896725 (Dude - Where's My RAM?  A deep dive into how Python uses memory - David Malcolm's PyCon 2011 video talk)\n\n * http://dmalcolm.fedorapeople.org/presentations/PyCon-US-2011/GdbPythonPresentation/GdbPython.html (David Malcolm's PyCon 2011 slides)\n\n * http://code.woboq.org/userspace/glibc/malloc/malloc.c.html (glibc's malloc.c implementation)\n\n * Malloc per-thread arenas in glibc (http://siddhesh.in/journal/2012/10/24/malloc-per-thread-arenas-in-glibc/)\n\n * Understanding the heap by breaking it (http://www.blackhat.com/presentations/bh-usa-07/Ferguson/Whitepaper/bh-usa-07-ferguson-WP.pdf)\n\n * Building your own Python version for an easier debugging experience (http://hustoknow.blogspot.com/2014/06/how-to-troubleshoot-your-python.html)"
  },
  {
    "path": "gdbheap.py",
    "content": "# Copyright (C) 2010  David Hugh Malcolm\n#\n# This library is free software; you can redistribute it and/or\n# modify it under the terms of the GNU Lesser General Public\n# License as published by the Free Software Foundation; either\n# version 2.1 of the License, or (at your option) any later version.\n#\n# This library is distributed in the hope that it will be useful,\n# but WITHOUT ANY WARRANTY; without even the implied warranty of\n# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU\n# Lesser General Public License for more details.\n#\n# You should have received a copy of the GNU Lesser General Public\n# License along with this library; if not, write to the Free Software\n# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA\n\nfrom heap.commands import register_commands\n\n# Register the commands with gdb:\nregister_commands()\n\n"
  },
  {
    "path": "heap/__init__.py",
    "content": "# Copyright (C) 2010  David Hugh Malcolm\n#\n# This library is free software; you can redistribute it and/or\n# modify it under the terms of the GNU Lesser General Public\n# License as published by the Free Software Foundation; either\n# version 2.1 of the License, or (at your option) any later version.\n#\n# This library is distributed in the hope that it will be useful,\n# but WITHOUT ANY WARRANTY; without even the implied warranty of\n# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU\n# Lesser General Public License for more details.\n#\n# You should have received a copy of the GNU Lesser General Public\n# License along with this library; if not, write to the Free Software\n# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA\n\nfrom collections import namedtuple\n\ntry:\n    import gdb\n\n\n    # We defer most type lookups to when they're needed, since they'll fail if the\n    # DWARF data for the relevant DSO hasn't been loaded yet, which is typically\n    # the case for an executable dynamically linked against glibc\n\n    type_void_ptr = gdb.lookup_type('void').pointer()\n    type_char_ptr = gdb.lookup_type('char').pointer()\n    type_unsigned_char_ptr = gdb.lookup_type('unsigned char').pointer()\n    sizeof_ptr = type_void_ptr.sizeof\n\n    if sizeof_ptr == 4:\n        def fmt_addr(addr):\n            return '0x%08x' % addr\n    else:\n        # Assume 64-bit:\n        def fmt_addr(addr):\n            return '0x%016x' % addr\n\nexcept ImportError:\n    # Support importing heap.parser from outside gdb\n    pass\n\n\nclass WrongInferiorProcess(RuntimeError):\n    def __init__(self, hint):\n        self.hint = hint\n\nNUM_HEXDUMP_BYTES = 20\n\n__type_cache = {}\n\ndef caching_lookup_type(typename):\n    '''Adds caching to gdb.lookup_type(), whilst still raising RuntimeError if\n    the type isn't found.'''\n    if typename in __type_cache:\n        gdbtype = __type_cache[typename]\n        if gdbtype:\n  
          return gdbtype\n        raise RuntimeError('(cached) Could not find type \"%s\"' % typename)\n    try:\n        if 0:\n            print('type cache miss: %r' % typename)\n        gdbtype = gdb.lookup_type(typename).strip_typedefs()\n    except RuntimeError:\n        # did not find the type: add a None to the cache\n        gdbtype = None\n    __type_cache[typename] = gdbtype\n    if gdbtype:\n        return gdbtype\n    raise RuntimeError('Could not find type \"%s\"' % typename)\n\ndef array_length(_gdbval):\n    '''Given a gdb.Value that's an array, determine the number of elements in\n    the array'''\n    arr_size = _gdbval.type.sizeof\n    elem_size = _gdbval[0].type.sizeof\n    # Integer division: the array size is an exact multiple of the element size\n    return arr_size // elem_size\n\ndef offsetof(typename, fieldname):\n    '''Get the offset (in bytes) from the start of the given type to the given\n    field'''\n\n    # This is a transliteration to gdb's python API of:\n    #    (int)(void*)&(((#typename*)NULL)->#fieldname)\n\n    t = caching_lookup_type(typename).pointer()\n    v = gdb.Value(0)\n    v = v.cast(t)\n    field = v[fieldname].cast(type_void_ptr)\n    return int(field.address)\n\nclass MissingDebuginfo(RuntimeError):\n    def __init__(self, module):\n        self.module = module\n\ndef check_missing_debuginfo(err, module):\n    assert isinstance(err, RuntimeError)\n    if err.args[0] == 'Attempt to extract a component of a value that is not a (null).':\n        # Then we likely are trying to extract a field from a struct but don't\n        # have the DWARF description of the fields of the struct loaded:\n        raise MissingDebuginfo(module)\n\nclass WrappedValue(object):\n    \"\"\"\n    Base class, wrapping an underlying gdb.Value, adding various useful methods\n    and allowing subclassing\n    \"\"\"\n    def __init__(self, gdbval):\n        self._gdbval = gdbval\n\n    # __getattr__ just made it too confusing\n    #def __getattr__(self, attr):\n    #    return WrappedValue(self.val[attr])\n\n    def 
field(self, attr):\n        return self._gdbval[attr]\n\n    def __str__(self):\n        return str(self._gdbval)\n\n    # See http://sourceware.org/gdb/onlinedocs/gdb/Values-From-Inferior.html#Values-From-Inferior\n    @property\n    def address(self):\n        return self._gdbval.address\n\n    @property\n    def is_optimized_out(self):\n        return self._gdbval.is_optimized_out\n\n    @property\n    def type(self):\n        return self._gdbval.type\n\n    @property\n    def dynamic_type(self):\n        return self._gdbval.dynamic_type\n\n    @property\n    def is_lazy(self):\n        return self._gdbval.is_lazy\n\n    def dereference(self):\n        return WrappedValue(self._gdbval.dereference())\n\n#    def address(self):\n#        return int(self._gdbval.cast(type_void_ptr))\n\n    def is_null(self):\n        return int(self._gdbval) == 0\n\nclass WrappedPointer(WrappedValue):\n    def as_address(self):\n        return int(self._gdbval.cast(type_void_ptr))\n\n    def __str__(self):\n        return ('<%s for inferior 0x%x>'\n                % (self.__class__.__name__,\n                   self.as_address()\n                   )\n                )\n\n    def cast(self, type_):\n        return WrappedPointer(self._gdbval.cast(type_))\n\n    def categorize_refs(self, usage_set, level=0, detail=None):\n        '''Hook for categorizing references known by the type this points to'''\n        # do nothing by default:\n        pass\n\n\ndef fmt_size(size):\n    '''\n    Pretty-formatting of numeric values: return a string, subdividing the\n    digits into groups of three, using commas\n    '''\n    s = str(size)\n    result = ''\n    while len(s)>3:\n        result = ',' + s[-3:] + result\n        s = s[0:-3]\n    result = s + result\n    return result\n\ndef as_hexdump_char(b):\n    '''Given a byte, return a string for use by hexdump, converting\n    non-printable/non-ASCII values as a period'''\n    if b>=0x20 and b < 0x80:\n        return chr(b)\n    else:\n       
 return '.'\n\ndef sign(amt):\n    if amt >= 0:\n        return '+'\n    else:\n        return '' # the '-' sign will come from the numeric repr\n\n\nclass Category(namedtuple('Category', ('domain', 'kind', 'detail'))):\n    '''\n    Categorization of an in-use area of memory\n\n      domain: high-level grouping e.g. \"python\", \"C++\", etc\n\n      kind: type information, appropriate to the domain e.g. a class/type\n\n        Domain     Meaning of 'kind'\n        ------     -----------------\n        'C++'      the C++ class\n        'python'   the python class\n        'cpython'  C structure/type (implementation detail within Python)\n        'pyarena'  Python memory allocator\n\n      detail: additional detail\n    '''\n\n    def __new__(_cls, domain, kind, detail=None):\n        return tuple.__new__(_cls, (domain, kind, detail))\n\n    def __str__(self):\n        return '%s:%s:%s' % (self.domain, self.kind, self.detail)\n\nclass Usage(object):\n    # Information about an in-use area of memory\n    __slots__ = ('start', 'size', 'category', 'level', 'hd', 'obj')\n\n    def __init__(self, start, size, category=None, level=None, hd=None, obj=None):\n        assert isinstance(start, int)\n        assert isinstance(size, int)\n        if category:\n            assert isinstance(category, Category)\n        self.start = start\n        self.size = size\n        self.category = category\n        self.level = level\n        self.hd = hd\n        self.obj = obj\n\n    def __repr__(self):\n        result = 'Usage(%s, %s' % (hex(self.start), hex(self.size))\n        if self.category:\n            result += ', %r' % (self.category, )\n        if self.hd:\n            result += ', hd=%r' % self.hd\n        if self.obj:\n            result += ', obj=%r' % self.obj\n        return result + ')'\n\n    def ensure_category(self, usage_set=None):\n        if self.category is None:\n            self.category = categorize(self, usage_set)\n\n    def ensure_hexdump(self):\n        if 
self.hd is None:\n            self.hd = hexdump_as_bytes(self.start, NUM_HEXDUMP_BYTES)\n\n\ndef hexdump_as_bytes(addr, size, chars_only=True):\n    addr = gdb.Value(addr).cast(type_unsigned_char_ptr)\n    bytebuf = []\n    for j in range(size):\n        ptr = addr + j\n        b = int(ptr.dereference())\n        bytebuf.append(b)\n\n    result = ''\n    if not chars_only:\n        result += ' '.join(['%02x' % b for b in bytebuf]) + ' |'\n    result += ''.join([as_hexdump_char(b) for b in bytebuf])\n    result += '|'\n\n    return result\n\ndef hexdump_as_int(addr, count):\n    addr = gdb.Value(addr).cast(caching_lookup_type('unsigned long').pointer())\n    bytebuf = []\n    wordbuf = []\n    for j in range(count):\n        ptr = addr + j\n        # Read one pointer-sized word (avoid shadowing the \"long\" builtin name):\n        word = ptr.dereference()\n        wordbuf.append(word)\n        bptr = gdb.Value(ptr).cast(type_unsigned_char_ptr)\n        for i in range(sizeof_ptr):\n            bytebuf.append(int((bptr + i).dereference()))\n    return (' '.join([fmt_addr(int(word)) for word in wordbuf])\n            + ' |'\n            + ''.join([as_hexdump_char(b) for b in bytebuf])\n            + '|')\n\n\nclass Table(object):\n    '''A table of text/numbers that knows how to print itself'''\n    def __init__(self, columnheadings=None, rows=[]):\n        self.numcolumns = len(columnheadings)\n        self.columnheadings = columnheadings\n        self.rows = []\n        self._colsep = '  '\n\n    def add_row(self, row):\n        assert len(row) == self.numcolumns\n        self.rows.append(row)\n\n    def write(self, out):\n        colwidths = self._calc_col_widths()\n\n        self._write_row(out, colwidths, self.columnheadings)\n\n        self._write_separator(out, colwidths)\n\n        for row in self.rows:\n            self._write_row(out, colwidths, row)\n\n    def _calc_col_widths(self):\n        result = []\n        for colIndex in range(self.numcolumns):\n            result.append(self._calc_col_width(colIndex))\n        return result\n\n    def 
_calc_col_width(self, idx):\n        cells = [str(row[idx]) for row in self.rows]\n        heading = self.columnheadings[idx]\n        return max([len(c) for c in (cells + [heading])])\n\n    def _write_row(self, out, colwidths, values):\n        for i, (value, width) in enumerate(zip(values, colwidths)):\n            if i > 0:\n                out.write(self._colsep)\n            formatString = \"%%%ds\" % width # to generate e.g. \"%20s\"\n            out.write(formatString % value)\n        out.write('\\n')\n\n    def _write_separator(self, out, colwidths):\n        for i, width in enumerate(colwidths):\n            if i > 0:\n                out.write(self._colsep)\n            out.write('-' * width)\n        out.write('\\n')\n\nclass UsageSet(object):\n    def __init__(self, usage_list):\n        self.usage_list = usage_list\n\n        # Ensure we can do fast lookups:\n        self.usage_by_address = dict([(int(u.start), u) for u in usage_list])\n\n    def set_addr_category(self, addr, category, level=0, visited=None, debug=False):\n        '''Attempt to mark the given address as being of the given category,\n        whilst maintaining a set of addresses already visited, to try to stop\n        infinite graph traversal'''\n        if visited:\n            if addr in visited:\n                if debug:\n                    print('addr 0x%x already visited (for category %r)' % (addr, category))\n                return False\n            visited.add(addr)\n\n        if addr in self.usage_by_address:\n            if debug:\n                print('addr 0x%x found (for category %r, level=%i)' % (addr, category, level))\n            u = self.usage_by_address[addr]\n            # Bail if we already have a more detailed categorization for the\n            # address (u.level is None for an uncategorized block):\n            if u.level is not None and level <= u.level:\n                if debug:\n                    print('addr 0x%x already has category %r (level %r)'\n                          % (addr, u.category, u.level))\n              
  return False\n            u.category = category\n            u.level = level\n            return True\n        else:\n            if debug:\n                print('addr 0x%x not found (for category %r)' % (addr, category))\n            return False\n\nclass PythonCategorizer(object):\n    '''\n    Logic for categorizing buffers owned by Python objects.\n    (Done as an object to capture the type-lookup state)\n    '''\n    def __init__(self):\n        '''This will raise a RuntimeError if the types aren't available (e.g. not\n        a python app, or debuginfo not available)'''\n        self._type_PyDictObject_ptr = caching_lookup_type('PyDictObject').pointer()\n        self._type_PyListObject_ptr = caching_lookup_type('PyListObject').pointer()\n        self._type_PySetObject_ptr = caching_lookup_type('PySetObject').pointer()\n        self._type_PyUnicodeObject_ptr = caching_lookup_type('PyUnicodeObject').pointer()\n        self._type_PyCodeObject_ptr = caching_lookup_type('PyCodeObject').pointer()\n        self._type_PyGC_Head = caching_lookup_type('PyGC_Head')\n\n    @classmethod\n    def make(cls):\n        '''Try to make a PythonCategorizer, if debuginfo is available; otherwise return None'''\n        try:\n            return cls()\n        except RuntimeError:\n            return None\n\n    def categorize(self, u, usage_set):\n        '''Try to categorize a Usage instance within a UsageSet (which could\n        lead to further categorization)'''\n        c = u.category\n        if c.domain != 'python':\n            return False\n        if u.obj:\n            if u.obj.categorize_refs(usage_set):\n                return True\n\n        if c.kind == 'list':\n            list_ptr = gdb.Value(u.start + self._type_PyGC_Head.sizeof).cast(self._type_PyListObject_ptr)\n            ob_item = int(list_ptr['ob_item'])\n            usage_set.set_addr_category(ob_item,\n                                        Category('cpython', 'PyListObject ob_item table', None))\n            return True\n\n       
 elif c.kind == 'set':\n            set_ptr = gdb.Value(u.start + self._type_PyGC_Head.sizeof).cast(self._type_PySetObject_ptr)\n            table = int(set_ptr['table'])\n            usage_set.set_addr_category(table,\n                                        Category('cpython', 'PySetObject setentry table', None))\n            return True\n\n        elif c.kind == 'code':\n            # Python 2.6's PyCode_Type doesn't have Py_TPFLAGS_HAVE_GC:\n            code_ptr = gdb.Value(u.start).cast(self._type_PyCodeObject_ptr)\n            co_code = int(code_ptr['co_code'])\n            usage_set.set_addr_category(co_code,\n                                        Category('python', 'str', 'bytecode'), # FIXME: on py3k this should be bytes\n                                        level=1)\n            return True\n        elif c.kind == 'sqlite3.Statement':\n            ptr_type = caching_lookup_type('pysqlite_Statement').pointer()\n            obj_ptr = gdb.Value(u.start).cast(ptr_type)\n            #print obj_ptr.dereference()\n            from heap.sqlite import categorize_sqlite3\n            for fieldname, catname, fn in (('db', 'sqlite3', categorize_sqlite3),\n                                           ('st', 'sqlite3_stmt', None)):\n                field_ptr = int(obj_ptr[fieldname])\n\n                # sqlite's src/mem1.c adds a sqlite3_int64 (size) to the front\n                # of the allocation, so we need to look 8 bytes earlier to find\n                # the malloc-ed region:\n                malloc_ptr = field_ptr - 8\n\n                # print u, fieldname, category, field_ptr\n                if usage_set.set_addr_category(malloc_ptr, Category('sqlite3', catname)):\n                    if fn:\n                        fn(field_ptr, usage_set, set())\n            return True\n\n        elif c.kind == 'rpm.hdr':\n            ptr_type = caching_lookup_type('struct hdrObject_s').pointer()\n            if ptr_type:\n                obj_ptr = 
gdb.Value(u.start).cast(ptr_type)\n                # print obj_ptr.dereference()\n                h = obj_ptr['h']\n                if usage_set.set_addr_category(int(h), Category('rpm', 'Header', None)):\n                    blob = h['blob']\n                    usage_set.set_addr_category(int(blob), Category('rpm', 'Header blob', None))\n\n        elif c.kind == 'rpm.mi':\n            ptr_type = caching_lookup_type('struct rpmmiObject_s').pointer()\n            if ptr_type:\n                obj_ptr = gdb.Value(u.start).cast(ptr_type)\n                print(obj_ptr.dereference())\n                mi = obj_ptr['mi']\n                if usage_set.set_addr_category(int(mi),\n                                               Category('rpm', 'rpmdbMatchIterator', None)):\n                    pass\n                    #blob = h['blob']\n                    #usage_set.set_addr_category(int(blob), 'rpm Header blob')\n\n        # Not categorized:\n        return False\n\ndef _get_register_state():\n    from heap.compat import execute\n    return execute('thread apply all info registers')\n\n__cached_usage_list = None\n__cached_reg_state = None\n\ndef lazily_get_usage_list():\n    '''Lazily do a full-graph categorization, getting a list of Usage instances'''\n    global __cached_usage_list\n    global __cached_reg_state\n\n    reg_state = _get_register_state()\n    # print 'reg_state', reg_state\n    if __cached_usage_list and __cached_reg_state:\n        # Verify that the inferior process hasn't changed state since the cache\n        # was populated.\n        # Something of a hack: verify that all registers have the same values:\n        if reg_state == __cached_reg_state:\n            # We can use the cache:\n            # print 'USING THE CACHE'\n            return __cached_usage_list\n\n    # print 'REGENERATING THE CACHE'\n\n    # Do the work:\n    usage_list = list(iter_usage_with_progress())\n    categorize_usage_list(usage_list)\n\n    # Update the cache:\n    
__cached_usage_list = usage_list\n    __cached_reg_state = reg_state\n\n    return __cached_usage_list\n\ndef categorize_usage_list(usage_list):\n    '''Do a \"full-graph\" categorization of the given list of Usage instances\n    For example, if p is a (PyDictObject*), then mark p->ma_table and p->ma_mask\n    accordingly\n    '''\n    usage_set = UsageSet(usage_list)\n    visited = set()\n\n    # Precompute some types, if available:\n    pycategorizer = PythonCategorizer.make()\n\n    for u in ProgressNotifier(iter(usage_list), 'Blocks analyzed'):\n        # Cover the simple cases, where the category can be figured out directly:\n        u.ensure_category(usage_set)\n\n        # Cross-references:\n        if u.obj:\n            if u.obj.categorize_refs(usage_set):\n                continue\n\n        # Try to categorize buffers used by python objects:\n        if pycategorizer:\n            if pycategorizer.categorize(u, usage_set):\n                continue\n\n    from heap.cpython import python_categorization\n    python_categorization(usage_set)\n\n\ndef categorize(u, usage_set):\n    '''Given an in-use block, try to guess what it's being used for\n    If usage_set is provided, this categorization may lead to further\n    categorizations'''\n    from heap.cpython import as_python_object, obj_addr_to_gc_addr\n    addr, size = u.start, u.size\n    pyop = as_python_object(addr)\n    if pyop:\n        u.obj = pyop\n        try:\n            return pyop.categorize()\n        except (RuntimeError, UnicodeEncodeError, UnicodeDecodeError):\n            # If something went wrong, assume that this wasn't really a python\n            # object, and fall through:\n            print(\"couldn't categorize pyop:\", pyop)\n            pass\n\n    # PyPy detection:\n    from heap.pypy import pypy_categorizer\n    cat = pypy_categorizer(addr, size)\n    if cat:\n        return cat\n\n    # C++ detection: only enabled if we can capture \"execute\"; there seems to\n    # be a bad 
interaction between pagination and redirection: all output from\n    # \"heap\" disappears in the fallback form of execute, unless we \"set pagination off\"\n    from heap.compat import has_gdb_execute_to_string\n    #  Disable for now, see https://bugzilla.redhat.com/show_bug.cgi?id=620930\n    if False: # has_gdb_execute_to_string:\n        from heap.cplusplus import get_class_name\n        cpp_cls = get_class_name(addr, size)\n        if cpp_cls:\n            return Category('C++', cpp_cls)\n\n    # GObject detection:\n    from heap.gobject import as_gtype_instance\n    ginst = as_gtype_instance(addr, size)\n    if ginst:\n        u.obj = ginst\n        return ginst.categorize()\n\n    s = as_nul_terminated_string(addr, size)\n    if s and len(s) > 2:\n        return Category('C', 'string data')\n\n    # Uncategorized:\n    return Category('uncategorized', '', '%s bytes' % size)\n\ndef as_nul_terminated_string(addr, size):\n    # Does this look like a NUL-terminated string?\n    ptr = gdb.Value(addr).cast(type_char_ptr)\n    try:\n        s = ptr.string(encoding='ascii')\n        return s\n    except (RuntimeError, UnicodeDecodeError):\n        # Probably not string data:\n        return None\n\nclass ProgressNotifier(object):\n    '''Wrap an iterable with progress notification to stdout'''\n    def __init__(self, inner, msg):\n        self.inner = inner\n        self.count = 0\n        self.msg = msg\n\n    def __iter__(self):\n        return self\n\n    def __next__(self):\n        self.count += 1\n        if 0 == self.count % 10000:\n            print(self.msg, self.count)\n        return self.inner.__next__()\n\n\n\ndef iter_usage_with_progress():\n    return ProgressNotifier(iter_usage(), 'Blocks retrieved')\n\n\nclass CachedInferiorState(object):\n    \"\"\"\n    Cached state containing information scraped from the inferior process\n    \"\"\"\n    def __init__(self):\n        self._arena_detectors = []\n\n    def add_arena_detector(self, detector):\n      
  self._arena_detectors.append(detector)\n\n    def detect_arena(self, ptr, chunksize):\n        '''Detect if this ptr returned by malloc is in use by any of the\n        layered allocation schemes, returning arena object if it is, None\n        if not'''\n        for detector in self._arena_detectors:\n            arena = detector.as_arena(ptr, chunksize)\n            if arena:\n                return arena\n\n        # Not found:\n        return None\n\n\ndef iter_usage():\n    # Iterate through glibc, and within that, within Python arena blocks, as appropriate\n    from heap.glibc import glibc_arenas\n    ms = glibc_arenas.get_ms()\n\n    cached_state = CachedInferiorState()\n\n    from heap.cpython import ArenaDetection as CPythonArenaDetection, PyArenaPtr, ArenaObject\n    try:\n        cpython_arenas = CPythonArenaDetection()\n        cached_state.add_arena_detector(cpython_arenas)\n    except WrongInferiorProcess:\n        pass\n\n    from heap.pypy import ArenaDetection as PyPyArenaDetection\n    try:\n        pypy_arenas = PyPyArenaDetection()\n        cached_state.add_arena_detector(pypy_arenas)\n    except WrongInferiorProcess:\n        pass\n\n    for i, chunk in enumerate(ms.iter_mmap_chunks()):\n        mem_ptr = chunk.as_mem()\n        chunksize = chunk.chunksize()\n\n        arena = cached_state.detect_arena(mem_ptr, chunksize)\n        if arena:\n            for u in arena.iter_usage():\n                yield u\n        else:\n            yield Usage(int(mem_ptr), chunksize)\n\n    for chunk in ms.iter_sbrk_chunks():\n        mem_ptr = chunk.as_mem()\n        chunksize = chunk.chunksize()\n\n        if chunk.is_inuse():\n            arena = cached_state.detect_arena(mem_ptr, chunksize)\n            if arena:\n                for u in arena.iter_usage():\n                    yield u\n            else:\n                yield Usage(int(mem_ptr), chunksize)\n\n\n\ndef looks_like_ptr(value):\n    '''Does this gdb.Value pointer's value look 
reasonable?\n\n    For use when casting a block of memory to a structure, to sanity-check\n    pointer fields within that block of memory.\n    '''\n\n    # NULL is acceptable; assume that it's 0 on every arch we care about\n    if value == 0:\n        return True\n\n    # Assume that pointers aren't allocated in the bottom 1MB of a process'\n    # address space:\n    if value < (1024 * 1024):\n        return False\n\n    # Assume that if it got this far, it's valid:\n    return True\n"
  },
  {
    "path": "heap/commands.py",
    "content": "# Copyright (C) 2010  David Hugh Malcolm\n#\n# This library is free software; you can redistribute it and/or\n# modify it under the terms of the GNU Lesser General Public\n# License as published by the Free Software Foundation; either\n# version 2.1 of the License, or (at your option) any later version.\n#\n# This library is distributed in the hope that it will be useful,\n# but WITHOUT ANY WARRANTY; without even the implied warranty of\n# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU\n# Lesser General Public License for more details.\n#\n# You should have received a copy of the GNU Lesser General Public\n# License along with this library; if not, write to the Free Software\n# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA\n\nimport gdb\nimport re\nimport sys\n\nfrom heap.glibc import glibc_arenas\nfrom heap.history import history, Snapshot, Diff\n\nfrom heap import lazily_get_usage_list, \\\n    fmt_size, fmt_addr, \\\n    categorize, categorize_usage_list, Usage, \\\n    hexdump_as_bytes, \\\n    Table, \\\n    MissingDebuginfo\n\ndef need_debuginfo(f):\n    def g(self, args, from_tty):\n        try:\n            return f(self, args, from_tty)\n        except MissingDebuginfo as e:\n            print('Missing debuginfo for %s' % e.module)\n            print('Suggested fix:')\n            print('    debuginfo-install %s' % e.module)\n    return g\n\nclass Heap(gdb.Command):\n    'Print a report on memory usage, by category'\n    def __init__(self):\n        gdb.Command.__init__ (self,\n                              \"heap\",\n                              gdb.COMMAND_DATA,\n                              prefix=True)\n\n    @need_debuginfo\n    def invoke(self, args, from_tty):\n        total_by_category = {}\n        count_by_category = {}\n        total_size = 0\n        total_count = 0\n        try:\n            usage_list = list(lazily_get_usage_list())\n            for u in usage_list:\n            
    u.ensure_category()\n                total_size += u.size\n                if u.category in total_by_category:\n                    total_by_category[u.category] += u.size\n                else:\n                    total_by_category[u.category] = u.size\n\n                total_count += 1\n                if u.category in count_by_category:\n                    count_by_category[u.category] += 1\n                else:\n                    count_by_category[u.category] = 1\n\n        except KeyboardInterrupt:\n            pass # FIXME\n\n        t = Table(['Domain', 'Kind', 'Detail', 'Count', 'Allocated size'])\n        for category in sorted(total_by_category.keys(),\n                               key=total_by_category.get,\n                               reverse=True):\n            detail = category.detail\n            if not detail:\n                detail = ''\n            t.add_row([category.domain,\n                       category.kind,\n                       detail,\n                       fmt_size(count_by_category[category]),\n                       fmt_size(total_by_category[category]),\n                       ])\n        t.add_row(['', '', 'TOTAL', fmt_size(total_count), fmt_size(total_size)])\n        t.write(sys.stdout)\n        print()\n\nclass HeapSizes(gdb.Command):\n    'Print a report on memory usage, by sizes'\n    def __init__(self):\n        gdb.Command.__init__ (self,\n                              \"heap sizes\",\n                              gdb.COMMAND_DATA)\n    @need_debuginfo\n    def invoke(self, args, from_tty):\n        ms = glibc_arenas.get_ms()\n        chunks_by_size = {}\n        num_chunks = 0\n        total_size = 0\n        try:\n            for chunk in ms.iter_chunks():\n                if not chunk.is_inuse():\n                    continue\n                size = int(chunk.chunksize())\n                num_chunks += 1\n                total_size += size\n                if size in chunks_by_size:\n                    
chunks_by_size[size] += 1\n                else:\n                    chunks_by_size[size] = 1\n        except KeyboardInterrupt:\n            pass # FIXME\n        t = Table(['Chunk size', 'Num chunks', 'Allocated size'])\n        for size in sorted(chunks_by_size.keys(),\n                           key=lambda s1: chunks_by_size[s1] * s1,\n                           reverse=True):\n            t.add_row([fmt_size(size),\n                       chunks_by_size[size],\n                       fmt_size(chunks_by_size[size] * size)])\n        t.add_row(['TOTALS', num_chunks, fmt_size(total_size)])\n        t.write(sys.stdout)\n        print()\n\n\nclass HeapUsed(gdb.Command):\n    'Print used heap chunks'\n    def __init__(self):\n        gdb.Command.__init__ (self,\n                              \"heap used\",\n                              gdb.COMMAND_DATA)\n\n    @need_debuginfo\n    def invoke(self, args, from_tty):\n        print('Used chunks of memory on heap')\n        print('-----------------------------')\n        ms = glibc_arenas.get_ms()\n        for i, chunk in enumerate(ms.iter_chunks()):\n            if not chunk.is_inuse():\n                continue\n            size = chunk.chunksize()\n            mem = chunk.as_mem()\n            u = Usage(mem, size)\n            category = categorize(u, None)\n            hd = hexdump_as_bytes(mem, 32)\n            print ('%6i: %s -> %s %8i bytes %20s |%s'\n                   % (i,\n                      fmt_addr(chunk.as_mem()),\n                      fmt_addr(chunk.as_mem()+size-1),\n                      size, category, hd))\n        print()\n\nclass HeapFree(gdb.Command):\n    'Print free heap chunks'\n    def __init__(self):\n        gdb.Command.__init__ (self,\n                              \"heap free\",\n                              gdb.COMMAND_DATA)\n\n    @need_debuginfo\n    def invoke(self, args, from_tty):\n        print('Free chunks of memory on heap')\n        print('-----------------------------')\n  
      ms = glibc_arenas.get_ms()\n        total_size = 0\n\n        for i, chunk in enumerate(ms.iter_free_chunks()):\n            size = chunk.chunksize()\n            total_size += size\n            mem = chunk.as_mem()\n            u = Usage(mem, size)\n            category = categorize(u, None)\n            hd = hexdump_as_bytes(mem, 32)\n\n            print ('%6i: %s -> %s %8i bytes %20s |%s'\n                   % (i,\n                      fmt_addr(chunk.as_mem()),\n                      fmt_addr(chunk.as_mem()+size-1),\n                      size, category, hd))\n\n        print(\"Total size: %s\" % total_size)\n\n\nclass HeapAll(gdb.Command):\n    'Print all heap chunks'\n    def __init__(self):\n        gdb.Command.__init__ (self,\n                              \"heap all\",\n                              gdb.COMMAND_DATA)\n\n    @need_debuginfo\n    def invoke(self, args, from_tty):\n        print('All chunks of memory on heap (both used and free)')\n        print('-------------------------------------------------')\n        ms = glibc_arenas.get_ms()\n        for i, chunk in enumerate(ms.iter_chunks()):\n            size = chunk.chunksize()\n            if chunk.is_inuse():\n                kind = ' inuse'\n            else:\n                kind = ' free'\n\n            print ('%i: %s -> %s %s: %i bytes (%s)'\n                   % (i,\n                      fmt_addr(chunk.as_address()),\n                      fmt_addr(chunk.as_address()+size-1),\n                      kind, size, chunk))\n        print()\n\nclass HeapLog(gdb.Command):\n    'Print a log of recorded heap states'\n    def __init__(self):\n        gdb.Command.__init__ (self,\n                              \"heap log\",\n                              gdb.COMMAND_DATA)\n\n    @need_debuginfo\n    def invoke(self, args, from_tty):\n        h = history\n        if len(h.snapshots) == 0:\n            print('(no history)')\n            return\n        for i in range(len(h.snapshots), 0, -1):\n    
        s = h.snapshots[i-1]\n            print('Label %i \"%s\" at %s' % (i, s.name, s.time))\n            print('    ', s.summary())\n            if i > 1:\n                prev = h.snapshots[i-2]\n                d = Diff(prev, s)\n                print()\n                print('    ', d.stats())\n            print()\n\nclass HeapLabel(gdb.Command):\n    'Record the current state of the heap for later comparison'\n    def __init__(self):\n        gdb.Command.__init__ (self,\n                              \"heap label\",\n                              gdb.COMMAND_DATA)\n\n    @need_debuginfo\n    def invoke(self, args, from_tty):\n        s = history.add(args)\n        print(s.summary())\n\n\nclass HeapDiff(gdb.Command):\n    'Compare two states of the heap'\n    def __init__(self):\n        gdb.Command.__init__ (self,\n                              \"heap diff\",\n                              gdb.COMMAND_DATA)\n\n    @need_debuginfo\n    def invoke(self, args, from_tty):\n        h = history\n        if len(h.snapshots) == 0:\n            print('(no history)')\n            return\n        prev = h.snapshots[-1]\n        curr = Snapshot.current('current')\n        d = Diff(prev, curr)\n        print('Changes from %s to %s' % (prev.name, curr.name))\n        print('  ', d.stats())\n        print()\n        print('\\n'.join(['  ' + line for line in d.as_changes().splitlines()]))\n\nclass HeapSelect(gdb.Command):\n    'Query used heap chunks'\n    def __init__(self):\n        gdb.Command.__init__ (self,\n                              \"heap select\",\n                              gdb.COMMAND_DATA)\n\n    @need_debuginfo\n    def invoke(self, args, from_tty):\n        from heap.query import do_query\n        from heap.parser import ParserError\n        try:\n            do_query(args)\n        except ParserError as e:\n            print(e)\n\nclass Hexdump(gdb.Command):\n    'Print a hexdump, starting at the specific region of memory'\n    def __init__(self):\n     
   gdb.Command.__init__ (self,\n                              \"hexdump\",\n                              gdb.COMMAND_DATA)\n\n    def invoke(self, args, from_tty):\n        arg_list = gdb.string_to_argv(args)\n\n        chars_only = True\n\n        if len(arg_list) == 2:\n            addr_arg = arg_list[0]\n            chars_only = (arg_list[1] == '-c')\n        else:\n            addr_arg = args\n\n        if addr_arg.startswith('0x'):\n            addr = int(addr_arg, 16)\n        else:\n            addr = int(addr_arg)\n\n        # assume that paging will cut in and the user will quit at some point:\n        size = 32\n        while True:\n            hd = hexdump_as_bytes(addr, size, chars_only=chars_only)\n            print ('%s -> %s %s' % (fmt_addr(addr), fmt_addr(addr + size -1), hd))\n            addr += size\n\nclass HeapArenas(gdb.Command):\n    'Display heap arenas available'\n    def __init__(self):\n        gdb.Command.__init__ (self,\n                              \"heap arenas\",\n                              gdb.COMMAND_DATA)\n\n    @need_debuginfo\n    def invoke(self, args, from_tty):\n        for n, arena in enumerate(glibc_arenas.arenas):\n            print(\"Arena #%d: %s\" % (n, arena.address))\n\nclass HeapArenaSelect(gdb.Command):\n    'Select heap arena'\n    def __init__(self):\n        gdb.Command.__init__ (self,\n                              \"heap arena\",\n                              gdb.COMMAND_DATA)\n\n    @need_debuginfo\n    def invoke(self, args, from_tty):\n        arena_num = int(args)\n\n        glibc_arenas.cur_arena = glibc_arenas.arenas[arena_num]\n        print(\"Arena set to %s\" % glibc_arenas.cur_arena.address)\n\n\n\ndef register_commands():\n    # Register the commands with gdb\n    Heap()\n    HeapSizes()\n    HeapUsed()\n    HeapFree()\n    HeapAll()\n    HeapLog()\n    HeapLabel()\n    HeapDiff()\n    HeapSelect()\n    HeapArenas()\n    HeapArenaSelect()\n    
Hexdump()\n\n    from heap.cpython import register_commands as register_cpython_commands\n    register_cpython_commands()\n"
  },
  {
    "path": "heap/compat.py",
    "content": "# Copyright (C) 2010  David Hugh Malcolm\n#\n# This library is free software; you can redistribute it and/or\n# modify it under the terms of the GNU Lesser General Public\n# License as published by the Free Software Foundation; either\n# version 2.1 of the License, or (at your option) any later version.\n#\n# This library is distributed in the hope that it will be useful,\n# but WITHOUT ANY WARRANTY; without even the implied warranty of\n# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU\n# Lesser General Public License for more details.\n#\n# You should have received a copy of the GNU Lesser General Public\n# License along with this library; if not, write to the Free Software\n# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA\n\n'''\ngdb versions vary greatly, this is a central place to\ndeal with varying capabilities of the underlying gdb and its python bindings\n'''\nimport gdb\n\n# gdb.execute's to_string keyword argument was added between F13 and F14.\n# See https://bugzilla.redhat.com/show_bug.cgi?id=610241\n\nhas_gdb_execute_to_string = True\ntry:\n    # This will either capture the result, or fail before executing,\n    # so in neither case should we get noise on stdout:\n    gdb.execute('info registers', to_string=True)\nexcept TypeError:\n    has_gdb_execute_to_string = False\n\ndef execute(command):\n    '''Equivalent to gdb.execute(to_string=True), returning the output as\n    a string rather than logging it to stdout.\n\n    On gdb versions lacking this capability, it uses redirection and temporary\n    files to achieve the same result'''\n    if has_gdb_execute_to_string:\n        return gdb.execute(command, to_string = True)\n    else:\n        import tempfile\n        f = tempfile.NamedTemporaryFile('r', delete=True)\n        gdb.execute(\"set logging off\")\n        gdb.execute(\"set logging redirect off\")\n        gdb.execute(\"set logging file %s\" % f.name)\n        gdb.execute(\"set 
logging redirect on\")\n        gdb.execute(\"set logging on\")\n        gdb.execute(command)\n        gdb.execute(\"set logging off\")\n        gdb.execute(\"set logging redirect off\")\n        result = f.read()\n        f.close()\n        return result\n\ndef dump():\n    print ('Does gdb.execute have a \"to_string\" keyword argument? : %s'\n           % has_gdb_execute_to_string)\n\n\n\n\n"
  },
  {
    "path": "heap/cplusplus.py",
    "content": "# Copyright (C) 2010  David Hugh Malcolm\n#\n# This library is free software; you can redistribute it and/or\n# modify it under the terms of the GNU Lesser General Public\n# License as published by the Free Software Foundation; either\n# version 2.1 of the License, or (at your option) any later version.\n#\n# This library is distributed in the hope that it will be useful,\n# but WITHOUT ANY WARRANTY; without even the implied warranty of\n# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU\n# Lesser General Public License for more details.\n#\n# You should have received a copy of the GNU Lesser General Public\n# License along with this library; if not, write to the Free Software\n# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA\n\n# C++ support\nimport re\n\nimport gdb\n\nfrom heap import caching_lookup_type, looks_like_ptr\nfrom heap.compat import execute\n\nvoid_ptr_ptr = caching_lookup_type('void').pointer().pointer()\n\ndef get_class_name(addr, size):\n    # Try to detect a vtable ptr at the top of this object:\n    vtable = gdb.Value(addr).cast(void_ptr_ptr).dereference()\n    if not looks_like_ptr(vtable):\n        return None\n\n    info = execute('info sym (void *)0x%x' % int(vtable))\n    # \"vtable for Foo + 8 in section .rodata of /home/david/heap/test_cplusplus\"\n    m = re.match(r'vtable for (.*) \+ (.*)', info)\n    if m:\n        return m.group(1)\n    # Not matched:\n    return None\n\n\ndef as_cplusplus_object(addr, size):\n    return get_class_name(addr, size)\n"
  },
  {
    "path": "heap/cpython.py",
    "content": "'''\nThis file is licensed under the PSF license\n'''\nimport sys\nimport gdb\nfrom heap import WrappedPointer, caching_lookup_type, Usage, \\\n    type_void_ptr, fmt_addr, Category, looks_like_ptr, \\\n    WrongInferiorProcess, Table\n\n\nSIZEOF_VOID_P = type_void_ptr.sizeof\n\n# Transliteration from Python's obmalloc.c:\nALIGNMENT             = 8\nALIGNMENT_SHIFT       = 3\nALIGNMENT_MASK        = (ALIGNMENT - 1)\n\n# Return the number of bytes in size class I:\ndef INDEX2SIZE(I):\n    return (I + 1) << ALIGNMENT_SHIFT\n\nSYSTEM_PAGE_SIZE      = (4 * 1024)\nSYSTEM_PAGE_SIZE_MASK = (SYSTEM_PAGE_SIZE - 1)\nARENA_SIZE            = (256 << 10)\nPOOL_SIZE             = SYSTEM_PAGE_SIZE\nPOOL_SIZE_MASK        = SYSTEM_PAGE_SIZE_MASK\ndef ROUNDUP(x):\n    return (x + ALIGNMENT_MASK) & ~ALIGNMENT_MASK\n\ndef POOL_OVERHEAD():\n    return ROUNDUP(caching_lookup_type('struct pool_header').sizeof)\n\nclass PyArenaPtr(WrappedPointer):\n    # Wrapper around a (void*) that's a Python arena's buffer (the\n    # arena->address, as opposed to the (struct arena_object*) itself)\n    @classmethod\n    def from_addr(cls, p, arenaobj):\n        ptr = gdb.Value(p)\n        ptr = ptr.cast(type_void_ptr)\n        return cls(ptr, arenaobj)\n\n    def __init__(self, gdbval, arenaobj):\n        WrappedPointer.__init__(self, gdbval)\n\n        assert(isinstance(arenaobj, ArenaObject))\n        self.arenaobj = arenaobj\n\n        # obmalloc.c sets up arenaobj->pool_address to the first pool\n        # address, aligning it to POOL_SIZE_MASK:\n        self.initial_pool_addr = self.as_address()\n        self.num_pools = ARENA_SIZE // POOL_SIZE\n        self.excess = self.initial_pool_addr & POOL_SIZE_MASK\n        if self.excess != 0:\n            self.num_pools -= 1\n            self.initial_pool_addr += POOL_SIZE - self.excess\n\n    def __str__(self):\n        return ('PyArenaPtr([%s->%s], %i pools [%s->%s], excess: %i tracked by %s)'\n                % 
(fmt_addr(self.as_address()),\n                   fmt_addr(self.as_address() + ARENA_SIZE - 1),\n                   self.num_pools,\n                   fmt_addr(self.initial_pool_addr),\n                   fmt_addr(self.initial_pool_addr\n                            + (self.num_pools * POOL_SIZE) - 1),\n                   self.excess,\n                   self.arenaobj\n                   )\n                )\n\n    def iter_pools(self):\n        '''Yield a sequence of PyPoolPtr, representing all of the pools within\n        this arena'''\n        # print 'num_pools:', num_pools\n        pool_addr = self.initial_pool_addr\n        for idx in range(self.num_pools):\n\n            # \"pool_address\" is a high-water-mark for activity within the arena;\n            # pools at this location or beyond haven't been initialized yet:\n            if pool_addr >= self.arenaobj.pool_address:\n                return\n\n            pool = PyPoolPtr.from_addr(pool_addr)\n            yield pool\n            pool_addr += POOL_SIZE\n\n    def iter_usage(self):\n        '''Yield a series of Usage instances'''\n        if self.excess != 0:\n            # FIXME: this size is wrong\n            yield Usage(self.as_address(), self.excess, Category('pyarena', 'alignment wastage'))\n\n        for pool in self.iter_pools():\n            # print 'pool:', pool\n            for u in pool.iter_usage():\n                yield u\n\n        # FIXME: unused space (if any) between pool_address and the alignment top\n\n        # if self.excess != 0:\n        #    # FIXME: this address is wrong\n        #    yield Usage(self.as_address(), self.excess, Category('pyarena', 'alignment wastage'))\n\n\nclass PyPoolPtr(WrappedPointer):\n    # Wrapper around Python's obmalloc.c: poolp: (struct pool_header *)\n\n    @classmethod\n    def from_addr(cls, p):\n        ptr = gdb.Value(p)\n        ptr = ptr.cast(cls.gdb_type())\n        return cls(ptr)\n\n    def __str__(self):\n        return ('PyPoolPtr([%s->%s: 
%d blocks of size %i bytes])'\n                % (fmt_addr(self.as_address()), fmt_addr(self.as_address() + POOL_SIZE - 1),\n                   self.num_blocks(), self.block_size()))\n\n    @classmethod\n    def gdb_type(cls):\n        # Deferred lookup of the \"poolp\" type:\n        return caching_lookup_type('poolp')\n\n    def block_size(self):\n        return INDEX2SIZE(self.field('szidx'))\n\n    def num_blocks(self):\n        firstoffset = self._firstoffset()\n        maxnextoffset = self._maxnextoffset()\n        offsetrange = maxnextoffset - firstoffset\n        return offsetrange // self.block_size() # FIXME: not exactly correct\n\n    def _firstoffset(self):\n        return POOL_OVERHEAD()\n\n    def _maxnextoffset(self):\n        return POOL_SIZE - self.block_size()\n\n    def iter_blocks(self):\n        '''Yield all blocks within this pool, whether free or in use'''\n        size = self.block_size()\n        maxnextoffset = self._maxnextoffset()\n        # print initnextoffset, maxnextoffset\n        offset = self._firstoffset()\n        base_addr = self.as_address()\n        while offset <= maxnextoffset:\n            yield (base_addr + offset, size)\n            offset += size\n\n    def iter_usage(self):\n        # The struct pool_header at the front:\n        yield Usage(self.as_address(),\n                    POOL_OVERHEAD(),\n                    Category('pyarena', 'pool_header overhead'))\n\n        fb = list(self.iter_free_blocks())\n        for (start, size) in fb:\n            yield Usage(start, size, Category('pyarena', 'freed pool chunk'))\n\n        for (start, size) in self.iter_used_blocks():\n            if (start, size) not in fb:\n                yield Usage(start, size) #, 'python pool: ' + categorize(start, size, None))\n\n        # FIXME: yield any wastage at the end\n\n    def iter_free_blocks(self):\n        '''Yield the sequence of free blocks within this pool.  
Doesn't include\n        the areas after nextoffset that have never been allocated'''\n        # print self._gdbval.dereference()\n        size = self.block_size()\n        freeblock = self.field('freeblock')\n        _type_block_ptr_ptr = caching_lookup_type('unsigned char').pointer().pointer()\n        # Walk the singly-linked list of free blocks for this chunk\n        while int(freeblock) != 0:\n            # print 'freeblock:', (fmt_addr(int(freeblock)), int(size))\n            yield (int(freeblock), int(size))\n            freeblock = freeblock.cast(_type_block_ptr_ptr).dereference()\n\n    def _free_blocks(self):\n        # Get the set of addresses of free blocks\n        return set([addr for addr, size in self.iter_free_blocks()])\n\n    def iter_used_blocks(self):\n        '''Yield the sequence of currently in-use blocks within this pool'''\n        # We'll filter out the free blocks from the list:\n        free_block_addresses = self._free_blocks()\n\n        size = self.block_size()\n        initnextoffset = self._firstoffset()\n        nextoffset = self.field('nextoffset')\n        #print initnextoffset, nextoffset\n        offset = initnextoffset\n        base_addr = self.as_address()\n        # Iterate upwards until you reach \"pool->nextoffset\": blocks beyond\n        # that point have never been allocated:\n        while offset < nextoffset:\n            addr = base_addr + offset\n            # Filter out those within this pool's linked list of free blocks:\n            if int(addr) not in free_block_addresses:\n                yield (int(addr), int(size))\n            offset += size\n\n\nPy_TPFLAGS_HEAPTYPE = (1 << 9)\n\nPy_TPFLAGS_INT_SUBCLASS      = (1 << 23)\nPy_TPFLAGS_LONG_SUBCLASS     = (1 << 24)\nPy_TPFLAGS_LIST_SUBCLASS     = (1 << 25)\nPy_TPFLAGS_TUPLE_SUBCLASS    = (1 << 26)\nPy_TPFLAGS_STRING_SUBCLASS   = (1 << 27)\nPy_TPFLAGS_UNICODE_SUBCLASS  = (1 << 28)\nPy_TPFLAGS_DICT_SUBCLASS     = (1 << 29)\nPy_TPFLAGS_BASE_EXC_SUBCLASS = (1 << 
30)\nPy_TPFLAGS_TYPE_SUBCLASS     = (1 << 31)\n\nclass PyObjectPtr(WrappedPointer):\n    @classmethod\n    def from_pyobject_ptr(cls, addr):\n        ob_type = addr['ob_type']\n        tp_flags = ob_type['tp_flags']\n        if tp_flags & Py_TPFLAGS_HEAPTYPE:\n            return HeapTypeObjectPtr(addr)\n\n        if tp_flags & Py_TPFLAGS_UNICODE_SUBCLASS:\n            return PyUnicodeObjectPtr(addr.cast(caching_lookup_type('PyUnicodeObject').pointer()))\n\n        if tp_flags & Py_TPFLAGS_DICT_SUBCLASS:\n            return PyDictObjectPtr(addr.cast(caching_lookup_type('PyDictObject').pointer()))\n\n        tp_name = ob_type['tp_name'].string()\n        if tp_name == 'instance':\n            __type_PyInstanceObjectPtr = caching_lookup_type('PyInstanceObject').pointer()\n            return PyInstanceObjectPtr(addr.cast(__type_PyInstanceObjectPtr))\n\n        return PyObjectPtr(addr)\n\n    def type(self):\n        return PyTypeObjectPtr(self.field('ob_type'))\n\n    def safe_tp_name(self):\n        try:\n            return self.type().field('tp_name').string()\n        except (RuntimeError, UnicodeDecodeError):\n            # Can't even read the object at all?\n            return 'unknown'\n\n    def categorize(self):\n        # Python objects will be categorized as (\"python\", tp_name), but\n        # old-style classes have to do more work\n        return Category('python', self.safe_tp_name())\n\n    def as_malloc_addr(self):\n        addr = int(self._gdbval)\n        ob_type = self.field('ob_type')\n        tp_flags = ob_type['tp_flags']\n        if tp_flags & Py_TPFLAGS_: # FIXME\n            return obj_addr_to_gc_addr(addr)\n        else:\n            return addr\n\n# Taken from my libpython.py code in python's Tools/gdb/libpython.py\n# FIXME: ideally should share code somehow\ndef _PyObject_VAR_SIZE(typeobj, nitems):\n    type_size_t = caching_lookup_type('size_t')\n    return ( ( typeobj.field('tp_basicsize') +\n               nitems * typeobj.field('tp_itemsize') 
+\n               (SIZEOF_VOID_P - 1)\n             ) & ~(SIZEOF_VOID_P - 1)\n           ).cast(type_size_t)\ndef int_from_int(gdbval):\n    return int(gdbval)\n\nclass PyUnicodeObjectPtr(PyObjectPtr):\n    \"\"\"\n    Class wrapping a gdb.Value that's a PyUnicodeObject* within the process\n    being debugged.\n    \"\"\"\n    _typename = 'PyUnicodeObject'\n\n    def categorize_refs(self, usage_set, level=0, detail=None):\n        m_str = int(self.field('str'))\n        usage_set.set_addr_category(m_str,\n                                    Category('cpython', 'PyUnicodeObject buffer', detail),\n                                    level)\n        return True\n\nclass PyDictObjectPtr(PyObjectPtr):\n    \"\"\"\n    Class wrapping a gdb.Value that's a PyDictObject* i.e. a dict instance\n    within the process being debugged.\n    \"\"\"\n    _typename = 'PyDictObject'\n\n    def categorize_refs(self, usage_set, level=0, detail=None):\n        ma_table = int(self.field('ma_table'))\n        usage_set.set_addr_category(ma_table,\n                                    Category('cpython', 'PyDictEntry table', detail),\n                                    level)\n        return True\n\nclass PyInstanceObjectPtr(PyObjectPtr):\n    _typename = 'PyInstanceObject'\n\n    def cl_name(self):\n        in_class = self.field('in_class')\n        # cl_name is a python string, not a char*; rely on\n        # prettyprinters for now:\n        cl_name = str(in_class['cl_name'])[1:-1]\n        return cl_name\n\n    def categorize(self):\n        return Category('python', self.cl_name(), 'old-style')\n\n    def categorize_refs(self, usage_set, level=0, detail=None):\n        cl_name = self.cl_name()\n        # print 'cl_name', cl_name\n\n        # Visit the in_dict:\n        in_dict = self.field('in_dict')\n        # print 'in_dict', in_dict\n\n        dict_detail = '%s.__dict__' % cl_name\n\n        # Mark the ptr as being a dictionary, adding detail\n        
usage_set.set_addr_category(obj_addr_to_gc_addr(in_dict),\n                                    Category('cpython', 'PyDictObject', dict_detail),\n                                    level=1)\n\n        # Visit ma_table:\n        _type_PyDictObject_ptr = caching_lookup_type('PyDictObject').pointer()\n        in_dict = in_dict.cast(_type_PyDictObject_ptr)\n\n        ma_table = int(in_dict['ma_table'])\n\n        # Record details:\n        usage_set.set_addr_category(ma_table,\n                                    Category('cpython', 'PyDictEntry table', dict_detail),\n                                    level=2)\n        return True\n\nclass PyTypeObjectPtr(PyObjectPtr):\n    _typename = 'PyTypeObject'\n\nclass HeapTypeObjectPtr(PyObjectPtr):\n    _typename = 'PyObject'\n\n    def categorize_refs(self, usage_set, level=0, detail=None):\n        attr_dict = self.get_attr_dict()\n        if attr_dict:\n            # Mark the dictionary's \"detail\" with our typename\n            # gdb.execute('print (PyObject*)0x%x' % int(attr_dict._gdbval))\n            usage_set.set_addr_category(obj_addr_to_gc_addr(attr_dict._gdbval),\n                                        Category('python', 'dict', '%s.__dict__' % self.safe_tp_name()),\n                                        level=level+1)\n\n            # and mark the dict's PyDictEntry with our typename:\n            attr_dict.categorize_refs(usage_set, level=level+1,\n                                      detail='%s.__dict__' % self.safe_tp_name())\n        return True\n\n    def get_attr_dict(self):\n        '''\n        Get the PyDictObject ptr representing the attribute dictionary\n        (or None if there's a problem)\n        '''\n        from heap import type_char_ptr\n        try:\n            typeobj = self.type()\n            dictoffset = int_from_int(typeobj.field('tp_dictoffset'))\n            if dictoffset != 0:\n                if dictoffset < 0:\n                    type_PyVarObject_ptr = 
caching_lookup_type('PyVarObject').pointer()\n                    tsize = int_from_int(self._gdbval.cast(type_PyVarObject_ptr)['ob_size'])\n                    if tsize < 0:\n                        tsize = -tsize\n                    size = _PyObject_VAR_SIZE(typeobj, tsize)\n                    dictoffset += size\n                    assert dictoffset > 0\n                    if dictoffset % SIZEOF_VOID_P != 0:\n                        # Corrupt somehow?\n                        return None\n\n                dictptr = self._gdbval.cast(type_char_ptr) + dictoffset\n                PyObjectPtrPtr = caching_lookup_type('PyObject').pointer().pointer()\n                dictptr = dictptr.cast(PyObjectPtrPtr)\n                return PyObjectPtr.from_pyobject_ptr(dictptr.dereference())\n        except RuntimeError:\n            # Corrupt data somewhere; fail safe\n            pass\n\n        # Not found, or some kind of error:\n        return None\n\ndef is_pyobject_ptr(addr):\n    try:\n        _type_pyop = caching_lookup_type('PyObject').pointer()\n        _type_pyvarop = caching_lookup_type('PyVarObject').pointer()\n    except RuntimeError:\n        # not linked against python\n        return None\n\n    pyop = gdb.Value(addr).cast(_type_pyop)\n    try:\n        ob_refcnt = pyop['ob_refcnt']\n        if ob_refcnt >=0 and ob_refcnt < 0xffff:\n            obtype = pyop['ob_type']\n            if obtype != 0:\n                type_refcnt = obtype.cast(_type_pyop)['ob_refcnt']\n                if type_refcnt > 0 and type_refcnt < 0xffff:\n                    type_ob_size = obtype.cast(_type_pyvarop)['ob_size']\n\n                    if type_ob_size > 0xffff:\n                        return 0\n\n                    for fieldname in ('tp_del', 'tp_mro', 'tp_init', 'tp_getset'):\n                        if not looks_like_ptr(obtype[fieldname]):\n                            return 0\n\n                    # Then this looks like a Python object:\n                    return 
PyObjectPtr.from_pyobject_ptr(pyop)\n\n    except (RuntimeError, UnicodeDecodeError):\n        pass # Not a python object (or corrupt)\n\n    # Doesn't look like a python object, implicit return None\n\ndef obj_addr_to_gc_addr(addr):\n    '''Given a PyObject* address, convert to a PyGC_Head* address\n    (i.e. the allocator's view of the same)'''\n    #print 'obj_addr_to_gc_addr(%s)' % fmt_addr(int(addr))\n    _type_PyGC_Head = caching_lookup_type('PyGC_Head')\n    return int(addr) - _type_PyGC_Head.sizeof\n\ndef as_python_object(addr):\n    '''Given an address of an allocation, determine if it holds a PyObject,\n    or a PyGC_Head\n\n    Return a WrappedPointer for the PyObject* if it does (which might have a\n    different location c.f. when PyGC_Head was allocated)\n\n    Return None if it doesn't look like a PyObject*'''\n    # Try casting to PyObject* ?\n    # FIXME: what about the debug allocator?\n    try:\n        _type_pyop = caching_lookup_type('PyObject').pointer()\n        _type_PyGC_Head = caching_lookup_type('PyGC_Head')\n    except RuntimeError:\n        # not linked against python\n        return None\n    pyop = is_pyobject_ptr(addr)\n    if pyop:\n        return pyop\n    else:\n        # maybe a GC type:\n        _type_PyGC_Head_ptr = _type_PyGC_Head.pointer()\n        gc_ptr = gdb.Value(addr).cast(_type_PyGC_Head_ptr)\n        # print gc_ptr.dereference()\n\n        PYGC_REFS_REACHABLE = -3\n\n        if gc_ptr['gc']['gc_refs'] == PYGC_REFS_REACHABLE:  # FIXME: need to cover other values\n            pyop = is_pyobject_ptr(gdb.Value(addr + _type_PyGC_Head.sizeof))\n            if pyop:\n                return pyop\n    # Doesn't look like a python object, implicit return None\n\n\nclass ArenaObject(WrappedPointer):\n    '''\n    Wrapper around Python's struct arena_object*\n    Note that this is record-keeping for an arena, not the\n    memory itself\n    '''\n    @classmethod\n    def iter_arenas(cls):\n        try:\n            val_arenas = 
gdb.parse_and_eval('arenas')\n            val_maxarenas = gdb.parse_and_eval('maxarenas')\n        except RuntimeError:\n            # Not linked against python, or no debug information:\n            raise WrongInferiorProcess('cpython')\n\n        try:\n            for i in range(val_maxarenas):\n                # Look up \"&arenas[i]\":\n                obj = ArenaObject(val_arenas[i].address)\n\n                # obj->address == 0 indicates an unused entry within the \"arenas\" array:\n                if obj.address != 0:\n                    yield obj\n        except RuntimeError:\n            # pypy also has a symbol named \"arenas\", of type \"long unsigned int * volatile\"\n            # For now, ignore it:\n            return\n\n    @property  # need to override the base property\n    def address(self):\n        return self.field('address')\n\n    def __init__(self, gdbval):\n        WrappedPointer.__init__(self, gdbval)\n\n        # Cache some values:\n        # This is the high-water mark: at this point and beyond, the bytes of\n        # memory are untouched since malloc:\n        self.pool_address = self.field('pool_address')\n\n\nclass ArenaDetection(object):\n    '''Detection of CPython arenas, done as an object so that we can cache state'''\n    def __init__(self):\n        self.arenaobjs = list(ArenaObject.iter_arenas())\n\n    def as_arena(self, ptr, chunksize):\n        '''Detect if this ptr returned by malloc is in use as a Python arena,\n        returning PyArenaPtr if it is, None if not'''\n        # Fast rejection of too-small chunks:\n        if chunksize < (256 * 1024):\n            return None\n\n        for arenaobj in self.arenaobjs:\n            if ptr == arenaobj.address:\n                # Found it:\n                return PyArenaPtr.from_addr(ptr, arenaobj)\n\n        # Not found:\n        return None\n\n\ndef python_categorization(usage_set):\n    # special-cased categorization for CPython\n\n    # The Objects/stringobject.c:interned 
dictionary is typically large,\n    # with its PyDictEntry table occupying 200k on a 64-bit build of python 2.6\n    # Identify it:\n    try:\n        val_interned = gdb.parse_and_eval('interned')\n        pyop = PyDictObjectPtr.from_pyobject_ptr(val_interned)\n        ma_table = int(pyop.field('ma_table'))\n        usage_set.set_addr_category(ma_table,\n                                    Category('cpython', 'PyDictEntry table', 'interned'),\n                                    level=1)\n    except RuntimeError:\n        pass\n\n    # Various kinds of per-type optimized allocator\n    # See Modules/gcmodule.c:clear_freelists\n\n    # The Objects/intobject.c: block_list\n    try:\n        val_block_list = gdb.parse_and_eval('block_list')\n        if str(val_block_list.type.target()) != 'PyIntBlock':\n            raise RuntimeError\n        while int(val_block_list) != 0:\n            usage_set.set_addr_category(int(val_block_list),\n                                        Category('cpython', '_intblock', ''),\n                                        level=0)\n            val_block_list = val_block_list['next']\n\n    except RuntimeError:\n        pass\n\n    # The Objects/floatobject.c: block_list\n    # TODO: how to get at this? 
multiple vars named \"block_list\"\n\n    # Objects/methodobject.c: PyCFunction_ClearFreeList\n    #   \"free_list\" of up to 256 PyCFunctionObject, but they're still of\n    #   that type\n\n    # Objects/classobject.c: PyMethod_ClearFreeList\n    #   \"free_list\" of up to 256 PyMethodObject, but they're still of that type\n\n    # Objects/frameobject.c: PyFrame_ClearFreeList\n    #   \"free_list\" of up to 300 PyFrameObject, but they're still of that type\n\n    # Objects/tupleobject.c: array of free_list: up to 2000 free tuples of each\n    # size from 1-20 (using ob_item[0] to chain up); singleton for size 0; they\n    # are still tuples when deallocated, though\n\n    # Objects/unicodeobject.c:\n    #   \"free_list\" of up to 1024 PyUnicodeObject, with the \"str\" buffer\n    #   optionally preserved also for lengths up to 9\n    #   They're all still of type \"unicode\" when free\n    #   Singletons for the empty unicode string, and for the first 256 code\n    #   points (Latin-1)\n\n# New gdb commands, specific to CPython\n\nfrom heap.commands import need_debuginfo\n\n\nclass HeapCPythonAllocators(gdb.Command):\n    \"For CPython: display information on the allocators\"\n    def __init__(self):\n        gdb.Command.__init__ (self,\n                              \"heap cpython-allocators\",\n                              gdb.COMMAND_DATA)\n\n    @need_debuginfo\n    def invoke(self, args, from_tty):\n        t = Table(columnheadings=('struct arena_object*', '256KB buffer location', 'Free pools'))\n        for arena in ArenaObject.iter_arenas():\n            t.add_row([fmt_addr(arena.as_address()),\n                       fmt_addr(arena.address),\n                       '%i / %i ' % (arena.field('nfreepools'),\n                                     arena.field('ntotalpools'))\n                       ])\n        print('Objects/obmalloc.c: %i arenas' % len(t.rows))\n        t.write(sys.stdout)\n        print()\n\n\ndef register_commands():\n    
HeapCPythonAllocators()\n"
  },
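The `_PyObject_VAR_SIZE` helper above is what `get_attr_dict` uses to resolve a negative `tp_dictoffset` relative to the end of a variable-sized object. A minimal standalone sketch of that rounding arithmetic, assuming a 64-bit build (`SIZEOF_VOID_P = 8`); `py_object_var_size` is an illustrative name, not part of the module:

```python
# Sketch of CPython's _PyObject_VAR_SIZE: the total allocation size of a
# variable-sized object, rounded up to a multiple of the pointer size.
SIZEOF_VOID_P = 8  # assumption: 64-bit build

def py_object_var_size(tp_basicsize, tp_itemsize, nitems):
    size = tp_basicsize + nitems * tp_itemsize
    # Round up to the next multiple of SIZEOF_VOID_P:
    return (size + SIZEOF_VOID_P - 1) & ~(SIZEOF_VOID_P - 1)

# A 24-byte header plus three 8-byte items is already aligned (48);
# an unaligned 26-byte header rounds 50 up to 56:
print(py_object_var_size(24, 8, 3), py_object_var_size(26, 8, 3))  # → 48 56
```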
  {
    "path": "heap/glibc.py",
    "content": "# Copyright (C) 2010  David Hugh Malcolm\n#\n# This library is free software; you can redistribute it and/or\n# modify it under the terms of the GNU Lesser General Public\n# License as published by the Free Software Foundation; either\n# version 2.1 of the License, or (at your option) any later version.\n#\n# This library is distributed in the hope that it will be useful,\n# but WITHOUT ANY WARRANTY; without even the implied warranty of\n# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU\n# Lesser General Public License for more details.\n#\n# You should have received a copy of the GNU Lesser General Public\n# License along with this library; if not, write to the Free Software\n# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA\n\n'''\ngdb 7 hooks for glibc's heap implementation\n\nSee /usr/src/debug/glibc-*/malloc/\ne.g. /usr/src/debug/glibc-2.11.1/malloc/malloc.h and /usr/src/debug/glibc-2.11.1/malloc/malloc.c\n\nThis file is licenced under the LGPLv2.1\n'''\n\nimport re\n\nimport gdb\n\nfrom heap import WrappedPointer, WrappedValue, caching_lookup_type, \\\n    type_char_ptr, check_missing_debuginfo, array_length, offsetof\n\nclass MChunkPtr(WrappedPointer):\n    '''Wrapper around glibc's mchunkptr\n\n    Note:\n      as_address() gives the address of the chunk as seen by the malloc implementation\n      as_mem() gives the address as seen by the user of malloc'''\n\n    # size field is or'ed with PREV_INUSE when previous adjacent chunk in use\n    PREV_INUSE = 0x1\n\n    # /* extract inuse bit of previous chunk */\n    # #define prev_inuse(p)       ((p)->size & PREV_INUSE)\n\n\n    # size field is or'ed with IS_MMAPPED if the chunk was obtained with mmap()\n    IS_MMAPPED = 0x2\n\n    # /* check for mmap()'ed chunk */\n    # #define chunk_is_mmapped(p) ((p)->size & IS_MMAPPED)\n\n\n    # size field is or'ed with NON_MAIN_ARENA if the chunk was obtained\n    # from a non-main arena.  
This is only set immediately before handing\n    # the chunk to the user, if necessary.\n    NON_MAIN_ARENA = 0x4\n\n    # /* check for chunk from non-main arena */\n    # #define chunk_non_main_arena(p) ((p)->size & NON_MAIN_ARENA)\n\n    SIZE_BITS = (PREV_INUSE|IS_MMAPPED|NON_MAIN_ARENA)\n\n    @classmethod\n    def gdb_type(cls):\n        # Deferred lookup of the \"mchunkptr\" type:\n        return caching_lookup_type('mchunkptr')\n\n    def size(self):\n        if not(hasattr(self, '_cached_size')):\n            self._cached_size = int(self.field('mchunk_size'))\n        return self._cached_size\n\n    def chunksize(self):\n        return self.size() & ~(self.SIZE_BITS)\n\n    def has_flag(self, flag):\n        return self.size() & flag\n\n    def has_PREV_INUSE(self):\n        return self.has_flag(self.PREV_INUSE)\n\n    def has_IS_MMAPPED(self):\n        return self.has_flag(self.IS_MMAPPED)\n\n    def has_NON_MAIN_ARENA(self):\n        return self.has_flag(self.NON_MAIN_ARENA)\n\n    def __str__(self):\n        result = ('<%s chunk=0x%x mem=0x%x'\n                  % (self.__class__.__name__,\n                     self.as_address(),\n                     self.as_mem()))\n        if self.has_PREV_INUSE():\n            result += ' PREV_INUSE'\n        else:\n            result += ' prev_size=%i' % self.field('mchunk_prev_size')\n        if self.has_NON_MAIN_ARENA():\n            result += ' NON_MAIN_ARENA'\n        if self.has_IS_MMAPPED():\n            result += ' IS_MMAPPED'\n        else:\n            if self.is_inuse():\n                result += ' inuse'\n            else:\n                result += ' free'\n        SIZE_SZ = caching_lookup_type('size_t').sizeof\n        result += ' chunksize=%i memsize=%i>' % (self.chunksize(),\n                                                 self.chunksize() - (2 * SIZE_SZ))\n        return result\n\n    def as_mem(self):\n        # Analog of chunk2mem: the address as seen by the program (e.g. 
malloc)\n        SIZE_SZ = caching_lookup_type('size_t').sizeof\n        return self.as_address() + (2 * SIZE_SZ)\n\n    def is_inuse(self):\n        # Is this chunk in use?\n        if self.has_IS_MMAPPED():\n            return True\n        # Analog of #define inuse(p)\n        #   ((((mchunkptr)(((char*)(p))+((p)->size & ~SIZE_BITS)))->size) & PREV_INUSE)\n        nc = self.next_chunk()\n        return nc.has_PREV_INUSE()\n\n    def next_chunk(self):\n        # Analog of:\n        #   #define next_chunk(p) ((mchunkptr)( ((char*)(p)) + ((p)->size & ~SIZE_BITS) ))\n        ptr = self._gdbval.cast(type_char_ptr)\n        cs = self.chunksize()\n        ptr += cs\n        ptr = ptr.cast(MChunkPtr.gdb_type())\n        #print 'next_chunk returning: 0x%x' % ptr\n        return MChunkPtr(ptr)\n\n    def prev_chunk(self):\n        # Analog of:\n        #   #define prev_chunk(p) ((mchunkptr)( ((char*)(p)) - ((p)->prev_size) ))\n        ptr = self._gdbval.cast(type_char_ptr)\n        ptr -= self.field('mchunk_prev_size')\n        ptr = ptr.cast(MChunkPtr.gdb_type())\n        return MChunkPtr(ptr)\n\nclass MBinPtr(MChunkPtr):\n    # Wrapper around an \"mbinptr\"\n\n    @classmethod\n    def gdb_type(cls):\n        # Deferred lookup of the \"mbinptr\" type:\n        return caching_lookup_type('mbinptr')\n\n    def first(self):\n        return MChunkPtr(self.field('fd'))\n\n    def last(self):\n        return MChunkPtr(self.field('bk'))\n\nclass MFastBinPtr(MChunkPtr):\n    # Wrapper around an \"mfastbinptr\"\n    pass\n\nclass MallocState(WrappedValue):\n    # Wrapper around struct malloc_state, as defined in malloc.c\n\n    def fastbin(self, idx):\n        return MFastBinPtr(self.field('fastbinsY')[idx])\n\n    def bin_at(self, i):\n        # addressing -- note that bin_at(0) does not exist\n        #  (mbinptr) (((char *) &((m)->bins[((i) - 1) * 2]))\n        #\t     - offsetof (struct malloc_chunk, fd))\n\n        ptr = self.field('bins')[(i-1)*2]\n        #print '001', ptr\n        ptr = ptr.address\n        #print '002', ptr\n        ptr = ptr.cast(type_char_ptr)\n        #print '003', ptr\n        ptr -= offsetof('struct malloc_chunk', 'fd')\n        #print '004', ptr\n        ptr = ptr.cast(MBinPtr.gdb_type())\n        #print '005', ptr\n        return MBinPtr(ptr)\n\n    def iter_chunks(self):\n        '''Yield a sequence of MChunkPtr corresponding to all chunks of memory\n        in the heap (both used and free), in order of ascending address'''\n\n        for c in self.iter_mmap_chunks():\n            yield c\n\n        for c in self.iter_sbrk_chunks():\n            yield c\n\n    def iter_mmap_chunks(self):\n        for inf in gdb.inferiors():\n            for (start, end) in iter_mmap_heap_chunks(inf.pid):\n                # print \"Trying 0x%x-0x%x\" % (start, end)\n                try:\n                    chunk = MChunkPtr(gdb.Value(start).cast(MChunkPtr.gdb_type()))\n                    # Does this look like the first chunk within a range of\n                    # mmap address space?\n                    #print ('0x%x' % chunk.as_address() + chunk.chunksize())\n                    if (not chunk.has_NON_MAIN_ARENA() and chunk.has_IS_MMAPPED()\n                        and chunk.as_address() + chunk.chunksize() <= end):\n\n                        # Iterate upwards until you reach \"end\" of mmap space:\n                        while chunk.as_address() < end and chunk.has_IS_MMAPPED():\n                            yield chunk\n                            # print '0x%x' % chunk.as_address(), chunk\n                            chunk = chunk.next_chunk()\n                except RuntimeError:\n                    pass\n\n    def iter_sbrk_chunks(self):\n        '''Yield a sequence of MChunkPtr corresponding to all chunks of memory\n        in the heap (both used and free), in order of ascending address, for those\n        from sbrk_base upwards'''\n        # FIXME: this is currently a hack; I need to verify my logic here\n\n        # As I 
understand it, it's only possible to navigate in the following ways:\n        #\n        # For a chunk with PREV_INUSE:0, prev_size is valid, and can be used\n        # to subtract down to the start of that chunk.\n        # For a chunk with PREV_INUSE:1, prev_size is not readable (reading it\n        # could lead to SIGSEGV), and it's not possible to get at the size of the\n        # previous chunk.\n\n        # For a free chunk, we have next/prev pointers to a doubly-linked list\n        # of other free chunks.\n\n        # For any chunk, we have the size, and that size gives us the address\n        # of the next chunk in RAM; so if we know the address of the first\n        # chunk, we can use this to iterate upwards through RAM, and thus\n        # iterate over all of the chunks.\n\n        # Start at \"mp_.sbrk_base\"\n        chunk = MChunkPtr(gdb.Value(sbrk_base()).cast(MChunkPtr.gdb_type()))\n        # sbrk_base is NULL when no small allocations have happened:\n        if chunk.as_address() > 0:\n            # Iterate upwards until you reach \"top\":\n            top = int(self.field('top'))\n            while chunk.as_address() != top:\n                yield chunk\n                # print '0x%x' % chunk.as_address(), chunk\n                try:\n                    chunk = chunk.next_chunk()\n                except RuntimeError:\n                    break\n\n    def iter_free_chunks(self):\n        '''Yield a sequence of MChunkPtr (some of which may be MFastBinPtr),\n        corresponding to the free chunks of memory'''\n        # Account for top:\n        # print('top')\n        yield MChunkPtr(self.field('top'))\n\n        NFASTBINS = self.NFASTBINS()\n        # Traverse fastbins:\n        for i in range(0, int(NFASTBINS)):\n            # print('fastbin %i' % i)\n            p = self.fastbin(i)\n            while not p.is_null():\n                yield p\n                p = MFastBinPtr(p.field('fd'))\n\n        #   for (p = fastbin (av, i); p != 0; p = 
p->fd) {\n        #     ++nfastblocks;\n        #     fastavail += chunksize(p);\n        #   }\n        # }\n\n        # Must keep this in-sync with malloc.c:\n        # FIXME: can we determine this dynamically from within gdb?\n        NBINS = 128\n\n        # Traverse regular bins:\n        for i in range(1, NBINS):\n            # print('regular bin %i' % i)\n            b = self.bin_at(i)\n            #print 'b: %s' % b\n            p = b.last()\n            n = 0\n            #print 'p:', p\n            while p.as_address() != b.as_address():\n                #print 'n:', n\n                #print 'b:', b\n                #print 'p:', p\n                n += 1\n                yield p\n                p = MChunkPtr(p.field('bk'))\n        #    for (p = last(b); p != b; p = p->bk) {\n        #        ++nblocks;\n        #          avail += chunksize(p);\n        #    }\n        # }\n\n    def NFASTBINS(self):\n        fastbinsY = self.field('fastbinsY')\n        return array_length(fastbinsY)\n\nclass MallocPar(WrappedValue):\n    # Wrapper around static struct malloc_par mp_\n    @classmethod\n    def get(cls):\n        # It's a singleton:\n        gdbval = gdb.parse_and_eval('mp_')\n        return MallocPar(gdbval)\n\ndef sbrk_base():\n    mp_ = MallocPar.get()\n    try:\n        return int(mp_.field('sbrk_base'))\n    except RuntimeError as e:\n        check_missing_debuginfo(e, 'glibc')\n        raise\n\n\n# See malloc.c:\n#    struct mallinfo mALLINFo(mstate av)\n#    {\n#      struct mallinfo mi;\n#      size_t i;\n#      mbinptr b;\n#      mchunkptr p;\n#      INTERNAL_SIZE_T avail;\n#      INTERNAL_SIZE_T fastavail;\n#      int nblocks;\n#      int nfastblocks;\n#\n#      /* Ensure initialization */\n#      if (av->top == 0)  malloc_consolidate(av);\n#\n#      check_malloc_state(av);\n#\n#      /* Account for top */\n#      avail = chunksize(av->top);\n#      nblocks = 1;  /* top always exists */\n#\n#      /* traverse fastbins */\n#     
 nfastblocks = 0;\n#      fastavail = 0;\n#\n#      for (i = 0; i < NFASTBINS; ++i) {\n#        for (p = fastbin (av, i); p != 0; p = p->fd) {\n#          ++nfastblocks;\n#          fastavail += chunksize(p);\n#        }\n#      }\n#\n#      avail += fastavail;\n#\n#      /* traverse regular bins */\n#      for (i = 1; i < NBINS; ++i) {\n#        b = bin_at(av, i);\n#        for (p = last(b); p != b; p = p->bk) {\n#          ++nblocks;\n#          avail += chunksize(p);\n#        }\n#      }\n#\n#      mi.smblks = nfastblocks;\n#      mi.ordblks = nblocks;\n#      mi.fordblks = avail;\n#      mi.uordblks = av->system_mem - avail;\n#      mi.arena = av->system_mem;\n#      mi.hblks = mp_.n_mmaps;\n#      mi.hblkhd = mp_.mmapped_mem;\n#      mi.fsmblks = fastavail;\n#      mi.keepcost = chunksize(av->top);\n#      mi.usmblks = mp_.max_total_mem;\n#      return mi;\n#    }\n#\n\n\ndef iter_mmap_heap_chunks(pid):\n    '''Try to locate the memory-mapped heap allocations for the given\n    process (by PID) by reading /proc/PID/maps\n\n    Yield a sequence of (start, end) pairs'''\n    for line in open('/proc/%i/maps' % pid):\n        # print line,\n        # e.g.:\n        # 38e441e000-38e441f000 rw-p 0001e000 fd:01 1087                           /lib64/ld-2.11.1.so\n        # 38e441f000-38e4420000 rw-p 00000000 00:00 0\n        hexd = r'[0-9a-f]'\n        hexdigits = '(' + hexd + '+)'\n        m = re.match(hexdigits + '-' + hexdigits\n                     + r' ([r\\-][w\\-][x\\-][ps]) ' + hexdigits\n                     + r' (..:..) 
(\\d+)\\s+(.*)',\n                     line)\n        if m:\n            # print m.groups()\n            start, end, perms, offset, dev, inode, pathname = m.groups()\n            # PROT_READ, PROT_WRITE, MAP_PRIVATE:\n            if perms == 'rw-p':\n                if offset == '00000000': # FIXME bits?\n                    if dev == '00:00': # FIXME\n                        if inode == '0': # FIXME\n                            if pathname == '': # FIXME\n                                # print 'heap line?:', line\n                                # print m.groups()\n                                start, end = [int(m.group(i), 16) for i in (1, 2)]\n                                yield (start, end)\n        else:\n            print('unmatched :', line)\n\nclass GlibcArenas(object):\n    def __init__(self):\n        self.main_arena = self.get_main_arena()\n        self.cur_arena = self.get_ms(self.main_arena)\n        self.get_arenas()\n\n    def get_main_arena(self):\n        return gdb.parse_and_eval(\"main_arena\")\n\n    def get_ms(self, arena_dereference=None):\n        if arena_dereference:\n            ms = MallocState(arena_dereference)\n        else:\n            ms = self.cur_arena\n\n        return ms\n\n    def get_arenas(self):\n        ar_ptr = self.get_ms(self.main_arena)\n\n        self.arenas = []\n        while True:\n            self.arenas.append(ar_ptr)\n\n            if ar_ptr.address != ar_ptr.field('next'):\n                ar_ptr = self.get_ms(ar_ptr.field('next').dereference())\n\n            if ar_ptr.address == self.main_arena.address:\n                return\n\n\n\nglibc_arenas = GlibcArenas()\n"
  },
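In `heap/glibc.py` above, `MChunkPtr.chunksize()` and `has_flag()` rely on the fact that the low three bits of a chunk's size word are flag bits, not size. A minimal standalone sketch of that decoding, using the same constants; `decode_size_field` is an illustrative helper, not part of the module:

```python
# Decode a glibc malloc chunk "size" word into (chunksize, flags), mirroring
# MChunkPtr.chunksize()/has_flag(): the low three bits are flag bits.
PREV_INUSE     = 0x1  # previous adjacent chunk is in use
IS_MMAPPED     = 0x2  # chunk was obtained via mmap()
NON_MAIN_ARENA = 0x4  # chunk belongs to a non-main arena
SIZE_BITS = PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA

def decode_size_field(size_field):
    chunksize = size_field & ~SIZE_BITS   # mask off the flag bits
    flags = [name for name, bit in (('PREV_INUSE', PREV_INUSE),
                                    ('IS_MMAPPED', IS_MMAPPED),
                                    ('NON_MAIN_ARENA', NON_MAIN_ARENA))
             if size_field & bit]
    return chunksize, flags

print(decode_size_field(0x91))  # → (144, ['PREV_INUSE'])
```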
  {
    "path": "heap/gobject.py",
    "content": "# Copyright (C) 2010  David Hugh Malcolm\n#\n# This library is free software; you can redistribute it and/or\n# modify it under the terms of the GNU Lesser General Public\n# License as published by the Free Software Foundation; either\n# version 2.1 of the License, or (at your option) any later version.\n#\n# This library is distributed in the hope that it will be useful,\n# but WITHOUT ANY WARRANTY; without even the implied warranty of\n# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU\n# Lesser General Public License for more details.\n#\n# You should have received a copy of the GNU Lesser General Public\n# License along with this library; if not, write to the Free Software\n# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA\n\nimport re\nimport sys\n\nimport gdb\n\nfrom heap import WrappedPointer, WrappedValue, caching_lookup_type, type_char_ptr, Category\n\n# Use glib's pretty-printers:\ndir_ = '/usr/share/glib-2.0/gdb'\nif not dir_ in sys.path:\n    sys.path.insert(0, dir_)\nfrom glib_gdb import read_global_var, g_quark_to_string\n\n\n# This was adapted from glib's gobject.py:g_type_to_name\ndef get_typenode_for_gtype(gtype):\n    def lookup_fundamental_type(typenode):\n        if typenode == 0:\n            return None\n        val = read_global_var(\"static_fundamental_type_nodes\")\n        if val == None:\n            return None\n\n        # glib has an address() call here on the end, which looks wrong\n        # (i) it's an attribute, not a method\n        # (ii) it converts a TypeNode* to a TypeNode**\n        return val[typenode >> 2]\n\n    gtype = int(gtype)\n    typenode = gtype - gtype % 4\n    if typenode > (255 << 2):\n        return gdb.Value(typenode).cast (gdb.lookup_type(\"TypeNode\").pointer())\n    else:\n        return lookup_fundamental_type (typenode)\n\ndef is_typename_castable(typename):\n    if typename.startswith('Gtk'):\n        return True\n    if 
typename.startswith('Gdk'):\n        return True\n    if typename.startswith('GType'):\n        return True\n    if typename.startswith('Pango'):\n        return True\n    if typename.startswith('GVfs'):\n        return True\n    return False\n\nclass GTypeInstancePtr(WrappedPointer):\n    @classmethod\n    def from_gtypeinstance_ptr(cls, addr, typenode):\n        typename = cls.get_type_name(typenode)\n        if typename:\n            cls = cls.get_class_for_typename(typename)\n            return cls(addr, typenode, typename)\n\n    @classmethod\n    def get_class_for_typename(cls, typename):\n        '''Get the GTypeInstance subclass for the given type name'''\n        if typename in typemap:\n            return typemap[typename]\n        return GTypeInstancePtr\n\n    def __init__(self, addr, typenode, typename):\n        # Try to cast the ptr to the named type:\n        addr = gdb.Value(addr)\n        try:\n            if is_typename_castable(typename):\n                # This requires, say, gtk2-debuginfo:\n                ptr_type = caching_lookup_type(typename).pointer()\n                addr = addr.cast(ptr_type)\n                #print typename, addr.dereference()\n                #if typename == 'GdkPixbuf':\n                #    print 'GOT PIXELS', addr['pixels']\n        except RuntimeError as e:\n            pass\n            #print addr, e\n\n        WrappedPointer.__init__(self, addr)\n        self.typenode = typenode\n        self.typename = typename\n        \"\"\"\n        try:\n            print 'self', self\n            print 'self.typename', self.typename\n            print 'typenode', typenode\n            print 'typenode.type', typenode.type\n            print 'typenode.dereference()', typenode.dereference()\n            print\n        except:\n            print 'got here'\n            raise\n        \"\"\"\n\n    def categorize(self):\n        return Category('GType', self.typename, '')\n\n    @classmethod\n    def get_type_name(cls, 
typenode):\n        return g_quark_to_string(typenode[\"qname\"])\n\n\nclass GdkColormapPtr(GTypeInstancePtr):\n    def categorize_refs(self, usage_set, level=0, detail=None):\n        # print 'got here 46'\n        pass\n        # GdkRgbInfo is stored as qdata on a GdkColormap\n\nclass GdkImagePtr(GTypeInstancePtr):\n    def categorize_refs(self, usage_set, level=0, detail=None):\n        priv_type = caching_lookup_type('GdkImagePrivateX11').pointer()\n        priv_data = WrappedPointer(self._gdbval['windowing_data'].cast(priv_type))\n\n        usage_set.set_addr_category(priv_data.as_address(),\n                                    Category('GType', 'GdkImagePrivateX11', ''),\n                                    level=level+1, debug=True)\n\n        ximage = WrappedPointer(priv_data.field('ximage'))\n        dims = '%sw x %sh x %sbpp' % (ximage.field('width'),\n                                      ximage.field('height'),\n                                      ximage.field('depth'))\n        usage_set.set_addr_category(ximage.as_address(),\n                                    Category('X11', 'Image', dims),\n                                    level=level+2, debug=True)\n\n        usage_set.set_addr_category(int(ximage.field('data')),\n                                    Category('X11', 'Image data', dims),\n                                    level=level+2, debug=True)\n\nclass GdkPixbufPtr(GTypeInstancePtr):\n    def categorize_refs(self, usage_set, level=0, detail=None):\n        dims = '%sw x %sh' % (self._gdbval['width'],\n                              self._gdbval['height'])\n        usage_set.set_addr_category(int(self._gdbval['pixels']),\n                                    Category('GType', 'GdkPixbuf pixels', dims),\n                                    level=level+1, debug=True)\n\nclass PangoCairoFcFontMapPtr(GTypeInstancePtr):\n    def categorize_refs(self, usage_set, level=0, detail=None):\n        # This gives us access to the freetype library:\n     
   FT_Library = WrappedPointer(self._gdbval['library'])\n\n        # This is actually a \"struct  FT_LibraryRec_\", in FreeType's\n        #   include/freetype/internal/ftobjs.h\n        # print FT_Library._gdbval.dereference()\n\n        usage_set.set_addr_category(FT_Library.as_address(),\n                                    Category('FreeType', 'Library', ''),\n                                    level=level+1, debug=True)\n\n        usage_set.set_addr_category(int(FT_Library.field('raster_pool')),\n                                    Category('FreeType', 'raster_pool', ''),\n                                    level=level+2, debug=True)\n        # potentially we could look at FT_Library['memory']\n\n\ntypemap = {\n    'GdkColormap':GdkColormapPtr,\n    'GdkImage':GdkImagePtr,\n    'GdkPixbuf':GdkPixbufPtr,\n    'PangoCairoFcFontMap':PangoCairoFcFontMapPtr,\n}\n\n\n\ndef as_gtype_instance(addr, size):\n    #type_GObject_ptr = caching_lookup_type('GObject').pointer()\n    try:\n        type_GTypeInstance_ptr = caching_lookup_type('GTypeInstance').pointer()\n    except RuntimeError:\n        # Not linked against GLib?\n        return None\n\n    gobj = gdb.Value(addr).cast(type_GTypeInstance_ptr)\n    try:\n        gtype = gobj['g_class']['g_type']\n        #print 'gtype', gtype\n        typenode = get_typenode_for_gtype(gtype)\n        # If I remove the next line, we get errors like:\n        #   Cannot access memory at address 0xd1a712caa5b6e5c0\n        # Does this line give us an early chance to raise an exception?\n        #print 'typenode', typenode\n        # It appears to be in the coercion to boolean here:\n        # if typenode:\n        if typenode is not None:\n            #print 'typenode.dereference()', typenode.dereference()\n            return GTypeInstancePtr.from_gtypeinstance_ptr(addr, typenode)\n    except RuntimeError:\n        # Any random buffer that we point this at that isn't a GTypeInstance (or\n        # GObject) is likely to raise a 
RuntimeError at some point in the above\n        pass\n    return None\n\n# FIXME: currently this ignores G_SLICE\n# e.g. use\n#    G_SLICE=always-malloc\n# to override this\n"
  },
  {
    "path": "heap/history.py",
    "content": "# Copyright (C) 2010  David Hugh Malcolm\n#\n# This library is free software; you can redistribute it and/or\n# modify it under the terms of the GNU Lesser General Public\n# License as published by the Free Software Foundation; either\n# version 2.1 of the License, or (at your option) any later version.\n#\n# This library is distributed in the hope that it will be useful,\n# but WITHOUT ANY WARRANTY; without even the implied warranty of\n# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU\n# Lesser General Public License for more details.\n#\n# You should have received a copy of the GNU Lesser General Public\n# License along with this library; if not, write to the Free Software\n# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA\n\nimport datetime\nfrom heap import iter_usage_with_progress, fmt_size, fmt_addr, sign\n\nclass Snapshot(object):\n    '''Snapshot of the state of the heap'''\n    def __init__(self, name, time):\n        self.name = name\n        self.time = time\n        self._all_usage = set()\n        self._totalsize = 0\n        self._num_usage = 0\n\n    def _add_usage(self, u):\n        self._all_usage.add(u)\n        self._totalsize += u.size\n        self._num_usage += 1\n        return u\n\n    @classmethod\n    def current(cls, name):\n        result = cls(name, datetime.datetime.now())\n        for i, u in enumerate(iter_usage_with_progress()):\n            u.ensure_category()\n            u.ensure_hexdump()\n            result._add_usage(u)\n        return result\n\n    def total_size(self):\n        '''Get total allocated size, in bytes'''\n        return self._totalsize\n\n    def summary(self):\n        return '%s allocated, in %i blocks' % (fmt_size(self.total_size()), \n                                               self._num_usage)\n\n    def size_by_address(self, address):\n        return self._chunk_by_address[address].size\n\nclass History(object):\n    '''History of snapshots 
of the state of the heap'''\n    def __init__(self):\n        self.snapshots = []\n\n    def add(self, name):\n        s = Snapshot.current(name)\n        self.snapshots.append(s)\n        return s\n\nclass Diff(object):\n    '''Differences between two states of the heap'''\n    def __init__(self, old, new):\n        self.old = old\n        self.new = new\n\n        self.new_minus_old = self.new._all_usage - self.old._all_usage\n        self.old_minus_new = self.old._all_usage - self.new._all_usage\n\n    def stats(self):\n        size_change = self.new.total_size() - self.old.total_size()\n        count_change = self.new._num_usage - self.old._num_usage\n        return \"%s%s bytes, %s%s blocks\" % (sign(size_change),\n                                      fmt_size(size_change),\n                                      sign(count_change),\n                                      fmt_size(count_change))\n\n    def as_changes(self):\n        result = self.chunk_report('Freed blocks', self.old, self.old_minus_new)\n        result += self.chunk_report('New blocks', self.new, self.new_minus_old)\n        # FIXME: add changed chunks\n        return result\n\n    def chunk_report(self, title, snapshot, set_of_usage):\n        result = '%s:\\n' % title\n        if len(set_of_usage) == 0:\n            result += '  (none)\\n'\n            return result\n        # Sort by start address (Python 3: sorted() takes key=, not a cmp function):\n        for usage in sorted(set_of_usage, key=lambda u: u.start):\n            result += ('  %s -> %s %8i bytes %20s |%s\\n'\n                       % (fmt_addr(usage.start),\n                          fmt_addr(usage.start + usage.size-1),\n                          usage.size, usage.category, usage.hd))\n        return result\n\nhistory = History()\n\n"
  },
  {
    "path": "heap/parser.py",
    "content": "# Copyright (C) 2010  David Hugh Malcolm\n#\n# This library is free software; you can redistribute it and/or\n# modify it under the terms of the GNU Lesser General Public\n# License as published by the Free Software Foundation; either\n# version 2.1 of the License, or (at your option) any later version.\n#\n# This library is distributed in the hope that it will be useful,\n# but WITHOUT ANY WARRANTY; without even the implied warranty of\n# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU\n# Lesser General Public License for more details.\n#\n# You should have received a copy of the GNU Lesser General Public\n# License along with this library; if not, write to the Free Software\n# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA\n\n\n# Query language for the heap\n\n# Uses \"ply\", so we'll need python-ply on Fedora\n\n# Split into tokenizer, then grammar, then external interface\n\n############################################################################\n# Tokenizer:\n############################################################################\nimport ply.lex as lex\n\nreserved = ['AND', 'OR', 'NOT']\ntokens = [\n    'ID','LITERAL_NUMBER', 'LITERAL_STRING',\n    'LPAREN','RPAREN',\n    'COMPARISON'\n    ] + reserved\n\nt_LPAREN  = r'\\('\nt_RPAREN  = r'\\)'\n\ndef t_ID(t):\n    r'[a-zA-Z_][a-zA-Z_0-9]*'\n    # Check for reserved words (case insensitive):\n    if t.value.upper() in reserved:\n        t.type = t.value.upper()\n    else:\n        t.type = 'ID'\n    return t\n\ndef t_COMPARISON(t):\n    r'<=|<|==|=|!=|>=|>'\n    return t\n\ndef t_LITERAL_NUMBER(t):\n    r'(0x[0-9a-fA-F]+|\\d+)'\n    try:\n        if t.value.startswith('0x'):\n            t.value = int(t.value, 16)\n        else:\n            t.value = int(t.value)\n    except ValueError:\n        raise ParserError(t.value)\n    return t\n\ndef t_LITERAL_STRING(t):\n    r'\"([^\"]*)\"'\n    # Drop the quotes:\n    t.value = t.value[1:-1]\n    
return t\n\n# Ignored characters\nt_ignore = \" \\t\"\n\ndef t_newline(t):\n    r'\\n+'\n    t.lexer.lineno += t.value.count(\"\\n\")\n\ndef t_error(t):\n    print(\"Illegal character '%s'\" % t.value[0])\n    t.lexer.skip(1)\n\nlexer = lex.lex()\n\n\n############################################################################\n# Grammar:\n############################################################################\nimport ply.yacc as yacc\n\nprecedence = (\n    ('left', 'AND', 'OR'),\n    ('left', 'NOT'),\n    ('left', 'COMPARISON'),\n)\n\nfrom heap.query import Constant, And, Or, Not, GetAttr, \\\n    Comparison__le__, Comparison__lt__, Comparison__eq__, \\\n    Comparison__ne__, Comparison__ge__, Comparison__gt__\n\n\ndef p_expression_number(t):\n    'expression : LITERAL_NUMBER'\n    t[0] = Constant(t[1])\n\ndef p_expression_string(t):\n    'expression : LITERAL_STRING'\n    t[0] = Constant(t[1])\n\ndef p_comparison(t):\n    'expression : expression COMPARISON expression'\n    classes = { '<=' : Comparison__le__,\n                '<'  : Comparison__lt__,\n                '==' : Comparison__eq__,\n                '='  : Comparison__eq__,\n                '!=' : Comparison__ne__,\n                '>=' : Comparison__ge__,\n                '>'  : Comparison__gt__ }\n    cls = classes[t[2]]\n\n    t[0] = cls(t[1], t[3])\n\ndef p_and(t):\n    'expression : expression AND expression'\n    t[0] = And(t[1], t[3])\n\ndef p_or(t):\n    'expression : expression OR expression'\n    t[0] = Or(t[1], t[3])\n\ndef p_not(t):\n    'expression : NOT expression'\n    t[0] = Not(t[2])\n\ndef p_expression_group(t):\n    'expression : LPAREN expression RPAREN'\n    t[0] = t[2]\n\ndef p_expression_name(t):\n    'expression : ID'\n    attrname = t[1]\n    attrnames = ('domain', 'kind', 'detail', 'addr', 'start', 'size')\n    if attrname not in attrnames:\n        raise ParserError.from_production(t, attrname,\n                                          ('Unknown attribute \"%s\" 
(supported are %s)'\n                                           % (attrname, ','.join(attrnames))))\n    t[0] = GetAttr(attrname)\n\nclass ParserError(Exception):\n    @classmethod\n    def from_production(cls, p, val, msg):\n        return ParserError(p.lexer.lexdata,\n                           p.lexer.lexpos - len(val),\n                           val,\n                           msg)\n\n    @classmethod\n    def from_token(cls, t, msg=\"Parse error\"):\n        return ParserError(t.lexer.lexdata,\n                           t.lexer.lexpos - len(t.value),\n                           t.value,\n                           msg)\n\n    def __init__(self, input_, pos, value, msg):\n        self.input_ = input_\n        self.pos = pos\n        self.value = value\n        self.msg = msg\n\n    def __str__(self):\n        return ('%s at \"%s\":\\n%s\\n%s'\n                % (self.msg, self.value,\n                   self.input_,\n                   ' '*self.pos + '^'*len(self.value)))\n\ndef p_error(t):\n    raise ParserError.from_token(t)\n\n\n############################################################################\n# Interface:\n############################################################################\n\n# Entry point:\ndef parse_query(s):\n    #try:\n    parser = yacc.yacc(debug=0, write_tables=0)\n    return parser.parse(s)#, debug=1)\n    #except ParserError, e:\n    #    print 'foo', e\n\ndef test_lexer(s):\n    lexer.input(s)\n    while True:\n        tok = lexer.token()\n        if not tok: break\n        print(tok)\n"
  },
  {
    "path": "heap/pypy.py",
    "content": "import gdb\nfrom heap import WrappedPointer, caching_lookup_type, Usage, \\\n    type_void_ptr, fmt_addr, Category, looks_like_ptr, \\\n    WrongInferiorProcess\n\ndef pypy_categorizer(addr, size):\n    return None\n\nclass ArenaCollection(WrappedPointer):\n\n    # Corresponds to pypy/rpython/memory/gc/minimarkpage.py:ArenaCollection\n\n    def get_arenas(self):\n        # Yield a sequence of (struct pypy_ArenaReference0*) gdb.Value instances\n        # representing the arenas\n        current_arena = self.field('ac_inst_current_arena')\n        # print \"self.field('ac_inst_current_arena'): %s\" % self.field('ac_inst_current_arena')\n        if current_arena:\n            yield ArenaReference(current_arena)\n        # print \"self.field('ac_inst_arenas_lists'):%s\" % self.field('ac_inst_arenas_lists')\n        #for arena in :\n        arena = self.field('ac_inst_arenas_lists')\n        #while arena:\n        #    yield ArenaReference(arena)\n        #    arena = arena.dereference()['ac_inst_nextarena']\n\nclass ArenaReference(WrappedPointer):\n    def iter_usage(self):\n        # print 'got PyPy arena within allocations'\n        return [] # FIXME\n\nclass ArenaDetection(object):\n    '''Detection of PyPy arenas, done as an object so that we can cache state'''\n    def __init__(self):\n        try:\n            ac_global = gdb.parse_and_eval('pypy_g_pypy_rpython_memory_gc_minimarkpage_ArenaCollect')\n        except RuntimeError:\n            # Not PyPy?\n            raise WrongInferiorProcess('pypy')\n        self._ac = ArenaCollection(ac_global.address)\n        self._arena_refs = []\n        self._malloc_ptrs = {}\n        for ar in self._ac.get_arenas():\n            print(ar)\n            print(ar._gdbval.dereference())\n            self._arena_refs.append(ar)\n            # ar_base : address as returned by malloc\n            self._malloc_ptrs[int(ar.field('ar_base'))] = ar\n        print(self._malloc_ptrs)\n\n    def as_arena(self, ptr, 
chunksize):\n        if ptr in self._malloc_ptrs:\n            return self._malloc_ptrs[ptr]\n        return None\n"
  },
  {
    "path": "heap/query.py",
    "content": "# Copyright (C) 2010  David Hugh Malcolm\n#\n# This library is free software; you can redistribute it and/or\n# modify it under the terms of the GNU Lesser General Public\n# License as published by the Free Software Foundation; either\n# version 2.1 of the License, or (at your option) any later version.\n#\n# This library is distributed in the hope that it will be useful,\n# but WITHOUT ANY WARRANTY; without even the implied warranty of\n# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU\n# Lesser General Public License for more details.\n#\n# You should have received a copy of the GNU Lesser General Public\n# License along with this library; if not, write to the Free Software\n# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA\n\nimport sys\n\nclass Expression(object):\n    def eval_(self, u):\n        raise NotImplementedError\n\n    def __eq__(self, other):\n        return (self.__class__ == other.__class__\n                and self.__dict__ == other.__dict__)\n\nclass Constant(Expression):\n    def __init__(self, value):\n        self.value = value\n\n    def __repr__(self):\n        return 'Constant(%r)' % (self.value,)\n\n    def eval_(self, u):\n        return self.value\n\nclass GetAttr(Expression):\n    def __init__(self, attrname):\n        self.attrname = attrname\n\n    def __repr__(self):\n        return 'GetAttr(%r)' % (self.attrname,)\n\n    def eval_(self, u):\n        if self.attrname in ('domain', 'kind', 'detail'):\n            if u.category == None:\n                u.ensure_category()\n            return getattr(u.category, self.attrname)\n        return getattr(u, self.attrname)\n\nclass BinaryOp(Expression):\n    def __init__(self, lhs, rhs):\n        self.lhs = lhs\n        self.rhs = rhs\n\nclass Comparison(BinaryOp):\n    def __init__(self, lhs, rhs):\n        BinaryOp.__init__(self, lhs, rhs)\n\n    def __repr__(self):\n        return '%s(%r, %r)' % (self.__class__.__name__, 
self.lhs, self.rhs)\n\n    def eval_(self, u):\n        lhs_val = self.lhs.eval_(u)\n        rhs_val = self.rhs.eval_(u)\n        return self.cmp_(lhs_val, rhs_val)\n\n    def cmp_(self, lhs, rhs):\n        raise NotImplementedError\n\nclass Comparison__le__(Comparison):\n    def cmp_(self, lhs, rhs):\n        return lhs <= rhs\n\nclass Comparison__lt__(Comparison):\n    def cmp_(self, lhs, rhs):\n        return lhs <  rhs\n\nclass Comparison__eq__(Comparison):\n    def cmp_(self, lhs, rhs):\n        return lhs == rhs\n\nclass Comparison__ne__(Comparison):\n    def cmp_(self, lhs, rhs):\n        return lhs != rhs\n\nclass Comparison__ge__(Comparison):\n    def cmp_(self, lhs, rhs):\n        return lhs >= rhs\n\nclass Comparison__gt__(Comparison):\n    def cmp_(self, lhs, rhs):\n        return lhs >  rhs\n\n\nclass And(BinaryOp):\n    def __repr__(self):\n        return 'And(%r, %r)' % (self.lhs, self.rhs)\n\n    def eval_(self, u):\n        # Short-circuit evaluation:\n        if not self.lhs.eval_(u):\n            return False\n        return self.rhs.eval_(u)\n\nclass Or(BinaryOp):\n    def __repr__(self):\n        return 'Or(%r, %r)' % (self.lhs, self.rhs)\n\n    def eval_(self, u):\n        # Short-circuit evaluation:\n        if self.lhs.eval_(u):\n            return True\n        return self.rhs.eval_(u)\n\nclass Not(Expression):\n    def __init__(self, inner):\n        self.inner = inner\n    def __repr__(self):\n        return 'Not(%r)' % (self.inner, )\n    def eval_(self, u):\n        return not self.inner.eval_(u)\n\n\n\nclass Column(object):\n    def __init__(self, name, getter, formatter):\n        self.name = name\n        self.getter = getter\n        self.formatter = formatter\n\n\nclass Query(object):\n    def __init__(self, filter_):\n        self.filter_ = filter_\n\n    def __iter__(self):\n        from heap import iter_usage_with_progress, lazily_get_usage_list\n\n        if True:\n            # 2-pass, but the expensive first pass may be 
cached\n            usage_list = lazily_get_usage_list()\n            for u in usage_list:\n                if self.filter_.eval_(u):\n                    yield u\n        else:\n            # 1-pass:\n            # This may miss blocks that are only categorized w.r.t. to other\n            # blocks:\n            for u in iter_usage_with_progress():\n                if self.filter_.eval_(u):\n                    yield u\n\ndef do_query(args):\n    from heap import fmt_addr, Table\n    from heap.parser import parse_query\n\n    if args == '':\n        # if no query supplied, select everything:\n        filter_ = Constant(True)\n    else:\n        filter_ = parse_query(args)\n\n    if False:\n        print(args)\n        print(filter_)\n\n    columns = [Column('Start',\n                      lambda u: u.start,\n                      fmt_addr),\n               Column('End',\n                      lambda u: u.start + u.size - 1,\n                      fmt_addr\n                      ),\n               Column('Domain',\n                      lambda u: u.category.domain,\n                      None),\n               Column('Kind',\n                      lambda u: u.category.kind,\n                      None),\n               Column('Detail',\n                      lambda u: u.category.detail,\n                      None),\n               Column('Hexdump',\n                      lambda u: u.hexdump,\n                      None),\n               ]\n\n    t = Table([col.name for col in columns])\n\n    for u in Query(filter_):\n        u.ensure_hexdump()\n        u.ensure_category()\n\n        if u.category:\n            domain = u.category.domain\n            kind = u.category.kind\n            detail = u.category.detail\n            if not detail:\n                detail = ''\n        else:\n            domain = ''\n            kind = ''\n            detail = ''\n\n        t.add_row([fmt_addr(u.start),\n                   fmt_addr(u.start + u.size - 1),\n                  
 domain,\n                   kind,\n                   detail,\n                   u.hd])\n\n    t.write(sys.stdout)\n    print()\n"
  },
  {
    "path": "heap/sqlite.py",
    "content": "# Copyright (C) 2010  David Hugh Malcolm\n#\n# This library is free software; you can redistribute it and/or\n# modify it under the terms of the GNU Lesser General Public\n# License as published by the Free Software Foundation; either\n# version 2.1 of the License, or (at your option) any later version.\n#\n# This library is distributed in the hope that it will be useful,\n# but WITHOUT ANY WARRANTY; without even the implied warranty of\n# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU\n# Lesser General Public License for more details.\n#\n# You should have received a copy of the GNU Lesser General Public\n# License along with this library; if not, write to the Free Software\n# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA\n\nfrom heap import Category, caching_lookup_type\n\nimport gdb\n\ndef categorize_sqlite3(addr, usage_set, visited):\n    # \"struct sqlite3\" is defined in src/sqliteInt.h, which is an internal header\n    ptr_type = caching_lookup_type('sqlite3').pointer()\n    obj_ptr = gdb.Value(addr).cast(ptr_type)\n    # print obj_ptr.dereference()\n\n    aDb = obj_ptr['aDb']\n    Db_addr = int(aDb)\n    Db_malloc_addr = Db_addr - 8\n    if usage_set.set_addr_category(Db_malloc_addr, Category('sqlite3', 'struct Db', None), visited):\n        print(aDb['pBt'].dereference())\n        # FIXME\n"
  },
  {
    "path": "make-release.sh",
    "content": "# Utility to help dmalcolm make releases:\nVERSION=$1\ngit clone git://git.fedorahosted.org/gdb-heap.git\n\npushd gdb-heap\ngit tag -a -m \"$VERSION\" $VERSION\n# FIXME: pushing this isn't working for some reason\npopd\n\nmv gdb-heap gdb-heap-${VERSION}\ntar cfvj gdb-heap-${VERSION}.tar.bz2 gdb-heap-${VERSION}\nscp gdb-heap-${VERSION}.tar.bz2 dmalcolm@fedorahosted.org:gdb-heap\nrm -rf gdb-heap-${VERSION}\n"
  },
  {
    "path": "object-sizes.py",
    "content": "# Copyright (C) 2010  David Hugh Malcolm\n#\n# This library is free software; you can redistribute it and/or\n# modify it under the terms of the GNU Lesser General Public\n# License as published by the Free Software Foundation; either\n# version 2.1 of the License, or (at your option) any later version.\n#\n# This library is distributed in the hope that it will be useful,\n# but WITHOUT ANY WARRANTY; without even the implied warranty of\n# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU\n# Lesser General Public License for more details.\n#\n# You should have received a copy of the GNU Lesser General Public\n# License along with this library; if not, write to the Free Software\n# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA\n\n\n# This is a support script for selftest.py\n\n# It creates various kinds of object, so that we can verify that gdb-heap\n# detects them (and their supporting buffers)\n\n\n# Four different kinds of (x, y) coordinate:\n\ntry:\n    from collections import namedtuple\n    NamedTuple = namedtuple('NamedTuple', ('x', 'y'))\nexcept ImportError:\n    NamedTuple = None\n\nclass OldStyle:\n    def __init__(self, x, y):\n        self.x = x\n        self.y = y\n\nclass NewStyle(object):\n    def __init__(self, x, y):\n        self.x = x\n        self.y = y\n\nclass NewStyleWithSlots(object):\n    __slots__ = ('x', 'y')\n    def __init__(self, x, y):\n        self.x = x\n        self.y = y\n\nobjs = []\ntypes = [OldStyle, NewStyle, NewStyleWithSlots]\nif NamedTuple:\n    types.append(NamedTuple)\nfor impl in types:\n    objs.append(impl(x=3, y=4))\nprint(objs)\n\n\n# Test creating an object with more than 8 attributes, so that the __dict__\n# has an external PyDictEntry buffer.\n# We will test to see if this detectable in the selftest.\nclass OldStyleManyAttribs:\n    def __init__(self, **kwargs):\n        self.__dict__ = kwargs\n\nclass NewStyleManyAttribs(object):\n    def __init__(self, 
**kwargs):\n        self.__dict__ = kwargs\n\n\n# Create instance with 9 attributes:\nold_style_many = OldStyleManyAttribs(**dict(zip('abcdefghi', range(9))))\nnew_style_many = NewStyleManyAttribs(**dict(zip('abcdefghi', range(9))))\n\n\n\n# Ensure that we have a set object that uses an externally allocated\n# buffer, so that we can verify that these are detected.  To do this,\n# we need a set with more than PySet_MINSIZE members (which is 8):\nlarge_set = set(range(64))\nlarge_frozenset = frozenset(range(64))\n\nimport sqlite3\ndb = sqlite3.connect(':memory:')\nc = db.cursor()\n\n# Create table\nc.execute('''CREATE TABLE dummy(foo TEXT, bar TEXT, v REAL)''')\n\n# Insert a row of data\nc.execute(\"INSERT INTO dummy VALUES ('ostrich', 'elephant', 42.0)\")\n\n# Save (commit) the changes\ndb.commit()\n\n# Don't close \"c\"; we want to see the objects in memory\n\n\n# Ensure that the selftest's breakpoint on builtin_id is hit:\nid(42)\n\n"
  },
  {
    "path": "resultparser.py",
    "content": "# Classes for working with the textual table output from gdb-heap\n\nimport unittest\nimport re\nfrom collections import namedtuple\n\ndef indent(str_):\n    return '\\n'.join([(' ' * 4) + line\n                      for line in str_.splitlines()])\n\nclass ColumnNotFound(Exception):\n    def __init__(self, colname, table):\n        self.colname = colname\n        self.table = table\n\n    def __str__(self):\n        return ('ColumnNotFound(%s) in:\\n%s'\n                % (self.colname, indent(str(self.table))))\n\nclass RowNotFound(Exception):\n    def __init__(self, criteria, table):\n        self.criteria = criteria\n        self.table = table\n    def __str__(self):\n        return ('RowNotFound(%s) in:\\n%s'\n                % (self.criteria, indent(str(self.table))))\n\nclass Criteria(object):\n    '''A list of (colname, value) criteria for searching rows in a table'''\n    def __init__(self, table, kvs):\n        self.kvs = kvs\n        self._by_index = [(table.find_col(attrname), value)\n                          for attrname, value in kvs]\n\n    def __str__(self):\n        return 'Criteria(%s)' % ','.join('%r=%r' % (attrname, value)\n                                         for attrname, value in self.kvs)\n\n    def is_matched_by(self, row):\n        for colindex, value in self._by_index:\n            if row[colindex] != value:\n                return False\n        return True\n\n\nclass ParsedTable(object):\n    '''Parses output from heap.Table, for use in writing selftests'''\n    @classmethod\n    def parse_lines(cls, data):\n        '''Parse the lines in the string, returning a list of ParsedTable\n        instances'''\n        result = []\n        lines = data.splitlines()\n        start = 0\n        while start < len(lines):\n            sep_line = cls._find_separator_line(lines[start:])\n            if sep_line:\n                sep_index, colmetrics = sep_line\n                t = ParsedTable(sep_index, colmetrics, 
lines[start:])\n                result.append(t)\n                start += t.sep_index + 1 + len(t.rows)\n            else:\n                break\n        return result\n\n    # Column metrics:\n    ColMetric = namedtuple('ColMetric', ('offset', 'width'))\n        \n    def __init__(self, sep_index, colmetrics, lines):\n        self.sep_index, self.colmetrics = sep_index, colmetrics\n\n        # Parse column headings:\n        header_index = self.sep_index - 1\n        self.colnames = self._split_cells(lines[header_index])\n\n        # Parse rows:\n        self.rows = []\n        for line in lines[self.sep_index + 1:]:\n            if line == '':\n                break\n            self.rows.append(self._split_cells(line))\n\n        self.rawdata = '\\n'.join(lines[header_index:header_index+len(self.rows)+2])\n\n    def __str__(self):\n        return self.rawdata\n\n    def as_rst_grid_table(self):\n\n        def _get_separator_row(colwidths, sepchar):\n            return '+' + ('+'.join([sepchar * width for width in colwidths])) + '+\\n'\n\n        def _get_row(values, colwidths):\n            row = '|'\n            cells = []\n            for value, width in zip(values, colwidths):\n                if value is None:\n                    cells.append(' ' * width)\n                else:\n                    formatString = \"%%%ds\" % width # to generate e.g. 
\"%20s\" \n                    cells.append(formatString % value)\n            row += '|'.join([cell for cell in cells])\n            row += '|\\n'\n            return row\n            \n        colwidths = [colmetric.width for colmetric in self.colmetrics]\n\n        result = _get_separator_row(colwidths, '-')\n        result += _get_row(self.colnames, colwidths)\n        result += _get_separator_row(colwidths, '=')\n        for row in self.rows:\n            result += _get_row(row, colwidths)\n            result += _get_separator_row(colwidths, '-')\n\n        return result\n\n    def get_cell(self, x, y):\n        return self.rows[y][x]\n\n    def find_col(self, colname):\n        # Find the index of the column with the given name\n        for x, col in enumerate(self.colnames):\n            if colname == col:\n                return x\n        raise ColumnNotFound(colname, self)\n\n    def find_row(self, kvs):\n        # Find the first row matching the criteria, or raise RowNotFound\n        criteria = Criteria(self, kvs)\n        for row in self.rows:\n            if criteria.is_matched_by(row):\n                return row\n        raise RowNotFound(criteria, self)\n\n    def find_cell(self, kvs, attr2name):\n        criteria = Criteria(self, kvs)\n        row = self.find_row(kvs)\n        return row[self.find_col(attr2name)]\n\n    def _get_cell_value(self, cellstr):\n        if cellstr == '':\n            return None\n\n        # Remove ',' separators from numbers, and treat as decimal:\n        m = re.match('^([0-9,]+)$', cellstr) # [0-9]\\,\n        if m:\n            return int(cellstr.replace(',', ''))\n\n        # Hexadecimal values:\n        m = re.match('^(0x[0-9a-f]+)$', cellstr)\n        if m:\n            return int(cellstr, 16)\n\n        # Keep as a str:\n        return cellstr\n\n    def _split_cells(self, line):\n        row = []\n        for col in self.colmetrics:\n            cellstr = line[col.offset: col.offset+col.width].lstrip()\n        
    cellvalue = self._get_cell_value(cellstr)\n            row.append(cellvalue)\n        return tuple(row)\n\n    @classmethod\n    def _find_separator_line(cls, lines):\n        # Look for the separator line\n        # Return (index, tuple of ColMetric)\n        for i, line in enumerate(lines):\n            if line.startswith('-'):\n                widths = [len(frag) for frag in line.split('  ')]\n                coldata = []\n                offset = 0\n                for width in widths:\n                    coldata.append(cls.ColMetric(offset=offset, width=width))\n                    offset += width + 2\n                return (i, tuple(coldata))\n            \n\n# Test data for table parsing (edited fragment of output during development):\ntest_table = '''\njunk line\n\n       Domain        Kind                 Detail  Count  Allocated size\n-------------  ----------  ---------------------  -----  --------------\n       python         str                         3,891         234,936\nuncategorized                        98312 bytes      1          98,312\nuncategorized                         1544 bytes     43          66,392\nuncategorized                         6152 bytes     10          61,520\n       python       tuple                         1,421          54,168\n                                                             0xdeadbeef\n                                           TOTAL  9,377         857,592\n\nanother junk line\n\nanother table\n\nChunk size  Num chunks  Allocated size\n----------  ----------  --------------\n        16         100           1,600\n        24          50           1,200\n    TOTALS         150           2,800\n\nmore junk\n'''\n\nclass ParserTests(unittest.TestCase):\n    def test_table_data(self):\n        tables = ParsedTable.parse_lines(test_table)\n        self.assertEquals(len(tables), 2)\n        pt = tables[0]\n\n        # Verify column names:\n        self.assertEquals(pt.colnames, ('Domain', 'Kind', 
'Detail', 'Count', 'Allocated size'))\n\n        # Verify (x,y) lookup, and type conversions:\n        self.assertEquals(pt.get_cell(0, 0), 'python')\n        self.assertEquals(pt.get_cell(1, 3), None)\n        self.assertEquals(pt.get_cell(4, 5), 0xdeadbeef)\n        self.assertEquals(pt.get_cell(4, 6), 857592)\n\n        # Verify searching by value:\n        self.assertEquals(pt.find_col('Count'), 3)\n        self.assertEquals(pt.find_row([('Allocated size', 54168),]),\n                          ('python', 'tuple', None, 1421, 54168))\n        self.assertEquals(pt.find_cell([('Kind', 'str'),], 'Count'), 3891)\n\n        # Error-checking:\n        self.assertRaises(ColumnNotFound,\n                          pt.find_col, 'Ensure that a non-existent column raises an error')\n        self.assertRaises(RowNotFound,\n                          pt.find_row, [('Count', -1)])\n\n        # Verify that \"rawdata\" contains the correct string data:\n        self.assert_(pt.rawdata.startswith('       Domain'))\n        self.assert_(pt.rawdata.endswith('857,592'))\n\n        # Test the second table:\n        pt = tables[1]\n        self.assertEquals(pt.colnames, ('Chunk size', 'Num chunks', 'Allocated size'))\n        self.assertEquals(pt.get_cell(2, 2), 2800)\n        self.assert_(pt.rawdata.startswith('Chunk size'))\n        self.assert_(pt.rawdata.endswith('2,800'))\n\n\n    def test_multiple_tables(self):\n        tables = ParsedTable.parse_lines(test_table * 5)\n        self.assertEquals(len(tables), 10)\n\n    def test_rst(self):\n        tables = ParsedTable.parse_lines(test_table)\n        self.assertEquals(len(tables), 2)\n        pt = tables[0]\n        \n        rst_text = pt.as_rst_grid_table()\n        \n        exp = (\n            '+-------------+----------+---------------------+-----+--------------+\\n'\n            '|       Domain|      Kind|               Detail|Count|Allocated size|\\n'\n            
'+=============+==========+=====================+=====+==============+\\n'\n            '|       python|       str|                     | 3891|        234936|\\n'\n            '+-------------+----------+---------------------+-----+--------------+\\n'\n            '|uncategorized|          |          98312 bytes|    1|         98312|\\n'\n            '+-------------+----------+---------------------+-----+--------------+\\n'\n            '|uncategorized|          |           1544 bytes|   43|         66392|\\n'\n            '+-------------+----------+---------------------+-----+--------------+\\n'\n            '|uncategorized|          |           6152 bytes|   10|         61520|\\n'\n            '+-------------+----------+---------------------+-----+--------------+\\n'\n            '|       python|     tuple|                     | 1421|         54168|\\n'\n            '+-------------+----------+---------------------+-----+--------------+\\n'\n            '|             |          |                     |     |    3735928559|\\n'\n            '+-------------+----------+---------------------+-----+--------------+\\n'\n            '|             |          |                TOTAL| 9377|        857592|\\n'\n            '+-------------+----------+---------------------+-----+--------------+\\n')\n        \n        self.assertEquals(rst_text, exp)\n\nif __name__ == \"__main__\":\n    unittest.main()\n"
  },
  {
    "path": "run-gdb-heap",
    "content": "#!/bin/bash\n# Handy script for launching a program under gdb, whilst wiring up gdb to use\n# the working copy of gdb-heap\n# Typical usage:\n#   ./run-gdb-heap python\n# Note: \"$@\" (rather than $*) preserves quoting of the arguments passed through\n# to the inferior program.\nPYTHONPATH=\"$(pwd)\" \\\n  gdb \\\n  --eval-command=\"python import gdbheap\" \\\n  --args \"$@\"\n"
  },
  {
    "path": "selftest.py",
    "content": "# Copyright (C) 2010  David Hugh Malcolm\n#\n# This library is free software; you can redistribute it and/or\n# modify it under the terms of the GNU Lesser General Public\n# License as published by the Free Software Foundation; either\n# version 2.1 of the License, or (at your option) any later version.\n#\n# This library is distributed in the hope that it will be useful,\n# but WITHOUT ANY WARRANTY; without even the implied warranty of\n# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU\n# Lesser General Public License for more details.\n#\n# You should have received a copy of the GNU Lesser General Public\n# License along with this library; if not, write to the Free Software\n# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA\n\n# Verify that gdb can print information on the heap of an inferior process\n#\n# Adapted from Python's Lib/test/test_gdb.py, which in turn was adapted from\n# similar work in Unladen Swallow's Lib/test/test_jit_gdb.py\n\nimport os\nimport re\nfrom subprocess import Popen, PIPE, call as subprocess_call\nimport sys\nimport unittest\nimport random\nfrom test.test_support import run_unittest, findfile\n\nif sys.maxint == 0x7fffffff:\n    _32bit = True\nelse:\n    _32bit = False\n\ntry:\n    gdb_version, _ = Popen([\"gdb\", \"--version\"],\n                           stdout=PIPE).communicate()\nexcept OSError:\n    # This is what \"no gdb\" looks like.  
There may, however, be other\n    # errors that manifest this way too.\n    raise unittest.SkipTest(\"Couldn't find gdb on the path\")\ngdb_version_number = re.search(r\"^GNU gdb [^\\d]*(\\d+)\\.\", gdb_version)\nif int(gdb_version_number.group(1)) < 7:\n    raise unittest.SkipTest(\"gdb versions before 7.0 didn't support python embedding.\"\n                            \" Saw:\\n\" + gdb_version)\n\n# Verify that \"gdb\" was built with the embedded python support enabled:\ncmd = \"--eval-command=python import sys; print sys.version_info\"\np = Popen([\"gdb\", \"--batch\", cmd], stdout=PIPE)\ngdbpy_version, _ = p.communicate()\nif gdbpy_version == '':\n    raise unittest.SkipTest(\"gdb not built with embedded python support\")\n\nclass TestSource(object):\n    '''Programmatically construct C source code for a test program that calls into the heap'''\n    def __init__(self):\n        self.decls = ''\n        self.operations = ''\n        self.num_ptrs = 0\n        self.indent = '    '\n\n    def add_line(self, code):\n        self.operations += self.indent + code + '\\n'\n\n    def add_malloc(self, size, debug=False, typename=None):\n        self.num_ptrs += 1\n        varname = 'ptr%03i' % self.num_ptrs\n\n        if typename:\n            cast = '(%s)' % typename\n        else:\n            typename = 'void *'\n            cast = ''\n\n        self.add_line('%s%s = %smalloc(0x%x); /* %i */'\n                      % (typename, varname, cast, size, size))\n        if debug:\n            self.add_line('printf(__FILE__ \":%%i:%s=%%p\\\\n\", __LINE__, %s);'\n                          % (varname, varname))\n            self.add_line('fflush(stdout);')\n        return varname\n\n    def add_realloc(self, varname, size, debug=False):\n        self.num_ptrs += 1\n        new_varname = 'ptr%03i' % self.num_ptrs\n        self.add_line('void *%s = realloc(%s, 0x%x);'\n                      % (new_varname, varname, size))\n        if debug:\n            
self.add_line('printf(__FILE__ \":%%i:%s=%%p\\\\n\", __LINE__, %s);'\n                          % (new_varname, new_varname))\n            self.add_line('fflush(stdout);')\n        return new_varname\n\n    def add_free(self, varname, debug=False):\n        self.add_line('free(%s);' % varname)\n\n    def add_breakpoint(self):\n        self.add_line('__asm__ __volatile__ (\"int $03\");')\n\n    def as_c_source(self):\n        result = '''\n#include <stdio.h>\n#include <stdlib.h>\n'''\n        result += self.decls\n        result += '''\nint\nmain (int argc, char **argv)\n{\n''' + self.operations + '''\n    return 0;\n}\n'''\n        return result\n        \n\nclass TestProgram(object):\n    def __init__(self, name, source, is_cplusplus=False):\n        self.name = name\n        self.source = source\n\n        if is_cplusplus:\n            self.srcname = '%s.cc' % self.name\n            compiler = 'g++'\n        else:\n            self.srcname = '%s.c' % self.name\n            compiler = 'gcc'\n\n        f = open(self.srcname, 'w')\n        f.write(source)\n        f.close()\n        \n        c = subprocess_call([compiler,\n\n                             # We want debug information:\n                             '-g', \n                             \n                             # Name of the binary:\n                             '-o', self.name,\n\n                             # The source file:\n                             self.srcname]) \n        # Check exit status:\n        assert(c == 0)\n        \n        # Check that the binary exists:\n        assert(os.path.exists(self.name))\n\nfrom resultparser import ParsedTable, RowNotFound, test_table\n\nclass DebuggerTests(unittest.TestCase):\n\n    \"\"\"Test that the debugger can debug the heap\"\"\"\n\n    def run_gdb(self, *args):\n        \"\"\"Runs gdb with the command line given by *args.\n\n        Returns its stdout, stderr\n        \"\"\"\n        out, err = Popen(args, stdout=PIPE, 
stderr=PIPE).communicate()\n        return out, err\n\n\n    def requires_binary(self, binary):\n        # Slightly complicated: gdb will look for the binary within the PWD\n        # as well as within the $PATH\n\n        if os.path.exists(binary):\n            # It's either an absolute or relative path, and directly exists:\n            return\n\n        p = Popen(['which', binary], stdout=PIPE, stderr=PIPE)\n        out, err = p.communicate()\n        if p.returncode == 0:\n            # It's in the $PATH\n            return\n\n        raise unittest.SkipTest(\"%s not found\" % binary)\n\n    def command_test(self, progargs, commands, breakpoint=None):\n\n        self.requires_binary(progargs[0])\n\n        # Run under gdb, hit the breakpoint, then run our \"heap\" command:\n        commands =  [\n            'python sys.path.append(\".\") ; import gdbheap'\n            ] + commands\n        args = [\"gdb\", \"--batch\"]\n        args += ['--eval-command=%s' % cmd for cmd in commands]\n        args += [\"--args\"] + progargs\n\n        # print args\n        # print ' '.join(args)\n\n        # Use \"args\" to invoke gdb, capturing stdout, stderr:\n        out, err = self.run_gdb(*args)\n\n        # Ignore some noise on stderr due to a pending breakpoint:\n        if breakpoint:\n            err = err.replace('Function \"%s\" not defined.\\n' % breakpoint, '')\n\n        # Ensure no unexpected error messages:\n        if err != '':\n            print out\n            print err\n            self.fail('stderr from gdb was non-empty: %r' % err)\n\n        return out        \n\n    def program_test(self, name, source, commands, is_cplusplus=False):\n        p = TestProgram(name, source, is_cplusplus)\n        return self.command_test([p.name], commands)\n\n    def test_no_allocations(self):\n        # Verify handling of an inferior process that doesn't use the heap\n        src = TestSource()\n        src.add_breakpoint()\n        source = src.as_c_source()\n\n        
out = self.program_test('test_no_allocations', source, commands=['run',  'heap sizes'])\n        self.assert_('''\nChunk size  Num chunks  Allocated size\n----------  ----------  --------------\n    TOTALS           0               0\n''' in out)\n\n    def test_small_allocations(self):\n        src = TestSource()\n        # 100 allocations each of sizes in the range 1-15\n        for i in range(100):\n            for size in range(1, 16):\n                src.add_malloc(size)\n        src.add_breakpoint()\n        source = src.as_c_source()\n\n        out = self.program_test('test_small_allocations', source, commands=['run',  'heap sizes'])\n\n        if _32bit:\n            exp = '''\nChunk size  Num chunks  Allocated size\n----------  ----------  --------------\n        16        1200          19,200\n        24         300           7,200\n    TOTALS        1500          26,400\n'''\n        else:\n            exp = '''\nChunk size  Num chunks  Allocated size\n----------  ----------  --------------\n        32        1500          48,000\n    TOTALS        1500          48,000\n'''\n        self.assert_(exp in out, out)\n\n\n    def test_large_allocations(self):\n        # 10 allocations each of sizes in the range 1MB through 10MB:\n        src = TestSource()\n        for i in range(10):\n            size = 1024 * 1024 * (i+1)\n            src.add_malloc(size)\n        src.add_breakpoint()\n        source = src.as_c_source()\n\n        out = self.program_test('test_large_allocations', source, commands=['run',  'heap sizes'])\n        self.assert_('''\nChunk size  Num chunks  Allocated size\n----------  ----------  --------------\n10,489,856           1      10,489,856\n 9,441,280           1       9,441,280\n 8,392,704           1       8,392,704\n 7,344,128           1       7,344,128\n 6,295,552           1       6,295,552\n 5,246,976           1       5,246,976\n 4,198,400           1       4,198,400\n 3,149,824           1       3,149,824\n 2,101,248        
   1       2,101,248\n 1,052,672           1       1,052,672\n    TOTALS          10      57,712,640\n''' in out)\n\n    def test_mixed_allocations(self):\n        # Compile test program\n        source = '''\n#include <stdio.h>\n#include <stdlib.h>\n\nint\nmain (int argc, char **argv)\n{\n    int i;\n    void *ptrs[100];\n    /* Some small allocations: */\n    for (i=0; i < 100; i++) {\n        ptrs[i] = malloc(256);\n        printf(\"malloc returned %p\\\\n\", ptrs[i]);\n        fflush(stdout);\n    }\n\n    /* Free one of the small allocations: */\n    free(ptrs[50]);\n\n    void* ptr1 = malloc(1000);\n    void* ptr2 = malloc(1000);\n    void* ptr3 = malloc(256000); /* large allocation */\n\n    /* Directly insert a breakpoint: */\n    __asm__ __volatile__ (\"int $03\");\n\n    return 0;\n}\n'''\n\n        out = self.program_test('test_simple', source, commands=['run',  'heap sizes'])\n        #print out\n\n        # Verify the result\n        if _32bit:\n            exp = '''\nChunk size  Num chunks  Allocated size\n----------  ----------  --------------\n   258,048           1         258,048\n       264          99          26,136\n     1,008           2           2,016\n    TOTALS         102         286,200\n'''\n        else:\n            exp = '''\nChunk size  Num chunks  Allocated size\n----------  ----------  --------------\n   258,048           1         258,048\n       272          99          26,928\n     1,008           2           2,016\n    TOTALS         102         286,992\n'''\n        self.assert_(exp in out, out)\n\n\n    def random_size(self):\n        size = random.randint(1, 64)\n        if random.randint(0, 5) == 0:\n            size *= 1024\n            size += random.randint(0, 1023)\n        if random.randint(0, 5) == 0:\n            size *= 256\n            size += random.randint(0, 255)\n        return size\n\n    def test_random_allocations(self):\n        # Fuzz-testing: lots of allocations (of various sizes)\n        # and 
deallocations\n        src = TestSource()\n        sizes = {}\n        live_blocks = set()\n        for i in range(100):\n            action = random.randint(1, 100)\n\n            # 70% chance of malloc:\n            if action <= 70:\n                size = self.random_size()\n                varname = src.add_malloc(size, debug=True)\n                sizes[varname] = size\n                live_blocks.add(varname)\n            if len(live_blocks) > 0:\n                # 10% chance of realloc (actions 71-80 inclusive):\n                if 71 <= action <= 80:\n                    size = self.random_size()\n                    old_varname = random.sample(live_blocks, 1)[0]\n                    live_blocks.remove(old_varname)\n                    new_varname = src.add_realloc(old_varname, size, debug=True)\n                    sizes[new_varname] = size\n                    live_blocks.add(new_varname)\n                # 20% chance of freeing something:\n                elif action > 80:\n                    varname = random.sample(live_blocks, 1)[0]\n                    live_blocks.remove(varname)\n                    src.add_free(varname)\n            src.add_breakpoint()\n\n        source = src.as_c_source()\n\n        out = self.program_test('test_random_allocations', source,\n                                commands=(['run']\n                                          + ['heap select', 'cont'] * 100))\n\n        # We have 100 states of the inferior process; check that each was\n        # reported as we expected it to be:\n        tables = ParsedTable.parse_lines(out)\n        self.assertEqual(len(tables), 100)\n        for i in range(100):\n            heap_select_out = tables[i]\n            #print heap_select_out\n            reported_addrs = set([heap_select_out.get_cell(0, y)\n                                  for y in range(len(heap_select_out.rows))])\n            #print reported_addrs\n\n        # FIXME: do some verification at each breakpoint: check that the\n        # reported 
values correspond to what we expect\n\n    def test_random_buffers(self):\n        # Fuzz-testing: try to break the heuristics by throwing random bytes\n        # at them.  Note that we do the randomization at the python level when\n        # generating the C code, so that the result of running any given C code\n        # is entirely reproducible.\n        src = TestSource()\n        for i in range(100):\n            varname = src.add_malloc(256, typename='unsigned char*')\n            for offset in range(256):\n                value = random.randint(0, 255)\n                src.add_line('%s[%i]=0x%02x;' % (varname, offset, value))\n        src.add_breakpoint()\n        source = src.as_c_source()\n        out = self.program_test('test_random_buffers', source, commands=['run',  'heap'])\n        # print out\n\n\n    def test_cplusplus(self):\n        '''Verify that we can detect and categorize instances of C++ classes'''\n        # Note that C++ detection is currently disabled due to a bug in execution capture\n        src = TestSource()\n        src.decls += '''\nclass Foo {\npublic:\n    virtual ~Foo() {}\n    int f1;\n    int f2;\n};\nclass Bar : Foo {\npublic:\n    virtual ~Bar() {}\n    int f1;\n    // Ensure that Bar has a different allocated size to Foo, on every arch:\n    int buffer[256];\n};\n'''\n        for i in range(100):\n            src.add_line('{Foo *f = new Foo();}')\n            if i % 2:\n                src.add_line('{Bar *b = new Bar();}')\n        src.add_breakpoint()\n        source = src.as_c_source()\n\n        out = self.program_test('test_cplusplus', source, is_cplusplus=True, commands=['run',  'heap sizes', 'heap'])\n        tables = ParsedTable.parse_lines(out)\n        heap_sizes_out = tables[0]\n        heap_out = tables[1]\n\n        # We ought to have 150 live blocks on the heap (100 Foo + 50 Bar):\n        self.assertHasRow(heap_out,\n                          [('Detail', 'TOTAL'), ('Count', 150)])\n\n        # Use the differing counts of the blocks 
to locate the objects\n        # FIXME: change the \"Domain\" values below and add \"Kind\" once C++\n        # identification is re-enabled:\n        self.assertHasRow(heap_out,\n                          [('Count', 100), ('Domain', 'uncategorized')])\n        self.assertHasRow(heap_out,\n                          [('Count', 50),  ('Domain', 'uncategorized')])\n\n    def test_history(self):\n        src = TestSource()\n        src.add_malloc(100)\n        src.add_malloc(100)\n        src.add_malloc(100)\n        src.add_breakpoint()\n\n\n        src.add_malloc(200)\n        src.add_malloc(200)\n        src.add_malloc(200)\n        src.add_breakpoint()\n        source = src.as_c_source()\n\n        out = self.program_test('test_history', source, \n                                commands=['run', 'heap sizes', 'heap label foo', 'cont', 'heap log', 'heap diff'])\n        #print out\n        # FIXME\n\n\n    def assertHasRow(self, table, kvs):\n        return table.find_row(kvs)\n        # ...which will raise a RowNotFound exception if there's a problem\n\n    def assertFoundCategory(self, table, domain, kind, detail=None):\n        # Ensure that the result table has a row of the given category\n        # (or raise RowNotFound)\n        kvs = [('Domain', domain),\n               ('Kind', kind)]\n        if detail:\n            kvs.append( ('Detail', detail) )\n\n        self.assertHasRow(table, kvs)\n\n    def test_assertions(self):\n        # Ensure that the domain-specific assertions work\n        tables = ParsedTable.parse_lines(test_table)\n        self.assertEquals(len(tables), 2)\n        pt = tables[0]\n\n        self.assertHasRow(pt, [('Domain', 'python'), ('Kind', 'str')])\n        self.assertRaises(RowNotFound,\n                          lambda: self.assertHasRow(pt, [('Domain', 'ruby')]))\n\n        self.assertFoundCategory(pt, 'python', 'str')\n        self.assertRaises(RowNotFound,\n                          lambda: self.assertFoundCategory(pt, 'ruby', 
'class'))\n\n    def test_gobject(self):\n        out = self.command_test(['gtk-demo'],\n                                commands=['set breakpoint pending yes',\n                                          'set environment G_SLICE=always-malloc', # for now\n                                          'break gtk_main',\n                                          'run',\n                                          'heap',\n                                          ])\n        # print out\n\n        tables = ParsedTable.parse_lines(out)\n        heap_out = tables[0]\n\n        # Ensure that instances of GObject classes are categorized:\n        self.assertFoundCategory(heap_out, 'GType', 'GtkTreeView')\n        self.assertFoundCategory(heap_out, 'GType', 'GtkLabel')\n\n        # Ensure that instances of fundamental boxed types are categorized:\n        self.assertFoundCategory(heap_out, 'GType', 'gchar')\n        self.assertFoundCategory(heap_out, 'GType', 'guint')\n\n        # Ensure that the code detected buffers used by the GLib/GTK types:\n        self.assertFoundCategory(heap_out,\n                                 'GType', 'GdkPixbuf pixels', '107w x 140h')\n\n        # GdkImage -> X11 Images -> data:\n        self.assertFoundCategory(heap_out, 'GType', 'GdkImage')\n        self.assertFoundCategory(heap_out, 'X11', 'Image')\n        if False:\n            # Only seen whilst using X forwarded over ssh:\n            self.assertFoundCategory(heap_out, 'X11', 'Image data')\n        # In both above rows, \"Detail\" contains the exact dimensions, but these\n        # seem to vary with the resolution of the display the test is run\n        # against\n\n        # FreeType:\n        # These seem to be highly dependent on the environment; I originally\n        # developed this whilst using X forwarded over ssh\n        if False:\n            self.assertFoundCategory(heap_out, 'GType', 'PangoCairoFcFontMap')\n            self.assertFoundCategory(heap_out, 'FreeType', 'Library')\n  
          self.assertFoundCategory(heap_out, 'FreeType', 'raster_pool')\n\n    def test_python2(self):\n        self._impl_test_python('python2', py3k=False)\n\n    def test_python3(self):\n        self._impl_test_python('python3', py3k=True)\n\n    def _impl_test_python(self, pyruntime, py3k):\n        # Test that we can debug CPython's memory usage, for a given runtime\n\n        # Invoke a test python script, stopping at a breakpoint\n        out = self.command_test([pyruntime, 'object-sizes.py'],\n                                commands=['set breakpoint pending yes',\n                                          'break builtin_id',\n                                          'run',\n                                          'heap cpython-allocators',\n                                          'heap',\n                                          'heap select kind=\"PyListObject ob_item table\"'],\n                                breakpoint='builtin_id')\n\n        # Re-enable this for debugging:\n        # print out\n\n        tables = ParsedTable.parse_lines(out)\n\n        # Verify that \"cpython-allocators\" works:\n        allocators_out = tables[0]\n        self.assertEquals(allocators_out.colnames,\n                          ('struct arena_object*',\n                           '256KB buffer location',\n                           'Free pools'))\n\n        # print allocators_out\n        # self.assertHasRow(allocators_out,\n        #                  kvs = [('Domain', 'cpython'),\n        #                         ('Kind', 'PyListObject ob_item table')])\n\n        heap_out = tables[1]\n\n        # Verify that \"select\" works for a category that's only detectable\n        # w.r.t. 
other categories:\n        select_out = tables[2]\n        # print select_out\n        self.assertHasRow(select_out,\n                          kvs = [('Domain', 'cpython'),\n                                 ('Kind', 'PyListObject ob_item table')])\n        \n        # Ensure that the code detected instances of various python types we\n        # expect to be present:\n        for kind in ('str', 'list', 'tuple', 'dict', 'type', 'code',\n                     'set', 'frozenset', 'function', 'module', 'frame', ):\n            self.assertFoundCategory(heap_out, 'python', kind)\n\n        if py3k:\n            self.assertFoundCategory(heap_out, 'python', 'bytes')\n        else:\n            self.assertFoundCategory(heap_out, 'python', 'unicode')\n\n        # Ensure that the blocks of int allocations are detected:\n        if not py3k:\n            self.assertFoundCategory(heap_out, 'cpython', '_intblock', '')\n\n        # Ensure that bytecode \"strings\" are marked as such:\n        self.assertFoundCategory(heap_out, 'python', 'str', 'bytecode') # FIXME\n\n        # Ensure that old-style classes are printed with a meaningful name\n        # (i.e. 
not just \"type\"):\n        if not py3k:\n            for clsname in ('OldStyle', 'OldStyleManyAttribs'):\n                self.assertFoundCategory(heap_out,\n                                         'python', clsname, 'old-style')\n\n                # ...and that their instance dicts are marked:\n                self.assertFoundCategory(heap_out,\n                                         'cpython', 'PyDictObject',\n                                         '%s.__dict__' % clsname)\n\n        # ...and that an old-style instance with enough attributes to require a\n        # separate PyDictEntry buffer for its __dict__ has that buffer marked\n        # with the typename:\n        self.assertFoundCategory(heap_out,\n                                 'cpython', 'PyDictEntry table',\n                                 'OldStyleManyAttribs.__dict__')\n\n        # Likewise for new-style classes:\n        for clsname in ('NewStyle', 'NewStyleManyAttribs'):\n            self.assertHasRow(heap_out,\n                              [('Domain', 'python'),\n                               ('Kind',   clsname),\n                               ('Detail', None)])\n            self.assertFoundCategory(heap_out,\n                              'python', 'dict', '%s.__dict__' % clsname)\n        self.assertFoundCategory(heap_out,\n                                 'cpython', 'PyDictEntry table',\n                                 'NewStyleManyAttribs.__dict__')\n\n        # Ensure that the code detected buffers used by python types:\n        for kind in ('PyDictEntry table', 'PyListObject ob_item table',\n                     'PySetObject setentry table',\n                     'PyUnicodeObject buffer', 'PyDictEntry table'):\n            self.assertFoundCategory(heap_out,\n                                     'cpython', kind)\n\n        # and of other types:\n        self.assertFoundCategory(heap_out,\n                                 'C', 'string data')\n        
self.assertFoundCategory(heap_out,\n                                 'pyarena', 'pool_header overhead')\n\n        # Ensure that the \"interned\" table is identified (it's typically\n        # at least 200k on a 64-bit build):\n        self.assertHasRow(heap_out,\n                          [('Domain', 'cpython'),\n                           ('Kind',   'PyDictEntry table'),\n                           ('Detail', 'interned'),\n                           ('Count',  1)])\n\n\n        # Ensure that we detect python sqlite3 objects:\n        for kind in ('sqlite3.Connection', 'sqlite3.Statement',\n                     'sqlite3.Cache'):\n            self.assertFoundCategory(heap_out,\n                                     'python', kind)\n        # ...and that we detect underlying sqlite3 buffers:\n        for kind in ('sqlite3', 'sqlite3_stmt'):\n            self.assertFoundCategory(heap_out,\n                                     'sqlite3', kind)\n\n    def test_pypy(self):\n        # Try to investigate memory usage of pypy-c\n        # Developed using pypy-1.4.1 as packaged on Fedora.\n        #\n        # In order to get meaningful data, let's try to trap the exit point\n        # of pypy-c within gdb.\n        #\n        # For now, lets try to put a breakpoint in this location within the\n        # generated \"pypy_g_entry_point\" C function:\n        #   print_stats:158 :         debug_stop(\"jit-summary\")\n        out = self.command_test(['pypy', 'object-sizes.py'],\n                                commands=['set breakpoint pending yes',\n\n                                          'break pypy_debug_stop',\n                                          'condition 1 0==strcmp(category, \"jit-summary\")',\n\n                                          'run',\n                                          'heap',\n                                          ])\n        tables = ParsedTable.parse_lines(out)\n        select_out = tables[0]\n\n    def test_select(self):\n        # 
Ensure that \"heap select\" with no query does something sane\n        src = TestSource()\n        for i in range(3):\n            src.add_malloc(1024)\n        src.add_breakpoint()\n        source = src.as_c_source()\n\n        out = self.program_test('test_select', source,\n                                commands=['run',\n                                          'heap select',\n                                          ])\n        tables = ParsedTable.parse_lines(out)\n        select_out = tables[0]\n\n        # The \"heap select\" command should select all blocks:\n        self.assertEquals(select_out.colnames,\n                          ('Start', 'End', 'Domain', 'Kind', 'Detail', 'Hexdump'))\n        self.assertEquals(len(select_out.rows), 3)\n\n\n        # Test that syntax errors are well handled:\n        out = self.program_test('test_select', source,\n                                commands=['run',\n                                          'heap select I AM A SYNTAX ERROR',\n                                          ])\n        errmsg = '''\nParse error at \"AM\":\nI AM A SYNTAX ERROR\n  ^^\n'''\n        if errmsg not in out:\n            self.fail('Did not find expected \"ParseError\" message in:\\n%s' % out)\n\n        # Test that unknown attributes are well-handled:\n        out = self.program_test('test_select', source,\n                                commands=['run',\n                                          'heap select NOT_AN_ATTRIBUTE > 42',\n                                          ])\n        errmsg = '''\nUnknown attribute \"NOT_AN_ATTRIBUTE\" (supported are domain,kind,detail,addr,start,size) at \"NOT_AN_ATTRIBUTE\":\nNOT_AN_ATTRIBUTE > 42\n  ^^^^^^^^^^^^^^^^\n'''\n        if errmsg not in out:\n            self.fail('Did not find expected \"Unknown attribute\" error message in:\\n%s' % out)\n\n        # Ensure that ply did not create debug files (ticket #12)\n        for filename in ('parser.out', 'parsetab.py'):\n            if 
os.path.exists(filename):\n                self.fail('Unexpectedly found file %r' % filename)\n\n    def test_select_by_size(self):\n        src = TestSource()\n        # Allocate ten 1kb blocks, nine 2kb blocks, etc, down to one 10kb\n        # block so that we can easily query them by size:\n        for i in range(10):\n            for j in range(10-i):\n                size = 1024 * (i+1)\n                src.add_malloc(size)\n        src.add_breakpoint()\n        source = src.as_c_source()\n\n        out = self.program_test('test_select_by_size', source,\n                                commands=['run',\n                                          'heap',\n\n                                          'heap select size >= 10240',\n                                          # (parsed as \"largest_out\" below)\n\n                                          'heap select size < 2048',\n                                          # (parsed as \"smallest_out\" below)\n\n                                          'heap select size >= 4096 and size < 8192',\n                                          # (parsed as \"middle_out\" below)\n                                          ])\n        tables = ParsedTable.parse_lines(out)\n        heap_out = tables[0]\n        largest_out = tables[1]\n        smallest_out = tables[2]\n        middle_out = tables[3]\n\n        # The \"heap\" command should find all the allocations:\n        self.assertHasRow(heap_out,\n                          [('Detail', 'TOTAL'), ('Count', 55)])\n\n        # The query for the largest should find just one allocation:\n        self.assertEquals(len(largest_out.rows), 1)\n\n        # The query for the smallest should find ten allocations:\n        self.assertEquals(len(smallest_out.rows), 10)\n\n        # The middle query [4096, 8192) should capture the following\n        # allocations:\n        #   7 of (4*1024), 6 of (5*1024), 5 of (6*1024) and 4 of (7*1024)\n        # giving a total count of 7+6+5+4 = 22\n
        self.assertEquals(len(middle_out.rows), 22)\n\n    def test_select_by_category(self):\n        out = self.command_test(['python', '-c', 'id(42)'],\n                                commands=['set breakpoint pending yes',\n                                          'break builtin_id',\n                                          'run',\n                                          'heap select domain=\"python\" and kind=\"str\" and size > 512'],\n                                breakpoint='builtin_id')\n\n        tables = ParsedTable.parse_lines(out)\n        select_out = tables[0]\n\n        # Ensure that the filtering mechanism worked:\n        if len(select_out.rows) < 10:\n            self.fail(\"Expected at least 10 large python strings (has something gone wrong?), found %i in: %s\" % (len(select_out.rows), select_out))\n        for row in select_out.rows:\n            self.assertEquals(row[2], 'python')\n            self.assertEquals(row[3], 'str')\n\n    def test_heap_used(self):\n        # Ensure that \"heap used\" works\n        src = TestSource()\n        for i in range(3):\n            src.add_malloc(1024)\n        src.add_breakpoint()\n        source = src.as_c_source()\n\n        out = self.program_test('test_heap_used', source,\n                                commands=['run',\n                                          'heap used',\n                                          ])\n        # FIXME: do some verification of the output\n\n    def test_heap_all(self):\n        # Ensure that \"heap all\" works\n        src = TestSource()\n        for i in range(3):\n            src.add_malloc(1024)\n        src.add_breakpoint()\n        source = src.as_c_source()\n\n        out = self.program_test('test_heap_all', source,\n                                commands=['run',\n                                          'heap all',\n                                          ])\n        # FIXME: do some verification of the output\n\n\nfrom heap.parser import parse_query\nfrom heap.query import Constant, And, 
Or, Not, GetAttr, \\\n    Comparison__le__, Comparison__lt__, Comparison__eq__, \\\n    Comparison__ne__, Comparison__ge__, Comparison__gt__\n\nclass QueryParsingTests(unittest.TestCase):\n    def assertParsesTo(self, s, result):\n        self.assertEquals(parse_query(s), result)\n\n    def test_simple_comparisons(self):\n        self.assertParsesTo('size >= 1024',\n                            Comparison__ge__(GetAttr('size'), Constant(1024)))\n\n        # Check that hexadecimal numeric literals are parsed:\n        self.assertParsesTo('addr > 0xbf70ffff',\n                            Comparison__gt__(GetAttr('addr'), Constant(0xbf70ffff)))\n\n        # Check that string literals are parsed:\n        self.assertParsesTo('kind == \"str\"',\n                            Comparison__eq__(GetAttr('kind'), Constant('str')))\n\n        # Check \"and\":\n        self.assertParsesTo('kind == \"str\" and size > 1024',\n                            And(Comparison__eq__(GetAttr('kind'), Constant('str')),\n                                Comparison__gt__(GetAttr('size'), Constant(1024))))\n\n        # Check \"not\" combined with \"and\":\n        self.assertParsesTo('size > 10000 and not domain=\"uncategorized\"',\n                            And(Comparison__gt__(GetAttr('size'), Constant(10000)),\n                                Not(Comparison__eq__(GetAttr('domain'), Constant('uncategorized')))))\n\n        # Do we want algebraic support?\n        #self.assertParsesTo('size == (256 * 1024)+8',\n        #                    Comparison('size', '==', 1024L))\n\n\nif __name__ == \"__main__\":\n    unittest.main()\n"
  }
]