GEPS 016: Enhancing Gramps Processing Speed

From Gramps
Jump to: navigation, search

This is a GEP for examining, detailing, and proposing enhancements to address speed issues in Gramps.

Enhancing Speed Workflow

The first step in working toward speeding up Gramps is to identify particular functions or activities that take more time than you would like.

The next is to examine in detail what functions are the bottle-necks in the processing. This is most easily and scientifically done through Python's Profiling tools.

Finally, developers can rewrite functions to speed up processing.

Areas of Speed Enhancement

Importing

See a discussion on Glacial imports (2009).

A gzipped profile log of importing a gramps file: Media:Profile-log-of-importing-a-gramps-file.zip

Profiling

Here is an example of starting up a command-line function and profiling it:

python /usr/lib/python2.7/cProfile.py -o profile-log src/gramps.py -i ~/Desktop/import-test.gramps

or

python3 /usr/lib/python3.4/profile.py -o profile-log Gramps.py -i ~/example/gramps/data.gramps

This example will save the profile output into a file called profile-log which can then be analyzed using the Stats class (described below).

This can take quite a bit of time. Note: importing does not automatically exit gramps. You should immediately exit, so that the code of running the GUI does not effect the profile of the import.

Profile Analysis

To analyze a profile log you can:

>>> import pstats
>>> profile_stats = pstats.Stats('profile-log')
>>> profile_stats.sort_stats('cumulative').print_stats(50)
Sat Oct 17 09:07:04 2009    import-profile.log

         19246501 function calls (19203079 primitive calls) in 481.667 CPU seconds

   Ordered by: cumulative time
   List reduced from 4101 to 50 due to restriction <50>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000  481.669  481.669 <string>:1(<module>)
        1    0.060    0.060  481.669  481.669 {execfile}
        1    0.136    0.136  481.609  481.609 src/gramps.py:29(<module>)
        1    0.066    0.066  479.398  479.398 src/gramps.py:140(run)
        1    0.000    0.000  478.916  478.916 /home/dblank/gramps/trunk/src/gui/grampsgui.py:354(startgtkloop)
        1    2.798    2.798  478.916  478.916 {gtk._gtk.main}
        1    0.001    0.001  466.958  466.958 /home/dblank/gramps/trunk/src/gui/grampsgui.py:288(__startgramps)
        1    0.068    0.068  466.801  466.801 /home/dblank/gramps/trunk/src/gui/grampsgui.py:237(__init__)
        1    0.000    0.000  448.359  448.359 /home/dblank/gramps/trunk/src/cli/arghandler.py:231(handle_args_gui)
        1    0.000    0.000  448.359  448.359 /home/dblank/gramps/trunk/src/cli/arghandler.py:328(__import_action)
        1    0.000    0.000  444.330  444.330 /home/dblank/gramps/trunk/src/cli/arghandler.py:402(cl_import)
        1    0.000    0.000  411.205  411.205 /home/dblank/gramps/trunk/src/plugins/import/ImportXml.py:85(importData)
        1    0.000    0.000  399.556  399.556 /home/dblank/gramps/trunk/src/plugins/import/ImportXml.py:734(parse)
        8    7.274    0.909  394.007   49.251 {built-in method ParseFile}
    48416    2.224    0.000  300.896    0.006 /home/dblank/gramps/trunk/src/gen/db/write.py:1349(commit_base)
    95167    2.149    0.000  279.076    0.003 /usr/lib/python2.6/bsddb/dbshelve.py:256(put)
    95167  265.444    0.003  272.442    0.003 {method 'put' of 'DB' objects}
   131354    1.247    0.000  223.661    0.002 ./Modules/pyexpat.c:656(EndElement)
   130749    2.319    0.000  222.406    0.002 /home/dblank/gramps/trunk/src/plugins/import/ImportXml.py:2353(endElement)
   262103    1.830    0.000  152.875    0.001 ./Modules/pyexpat.c:604(StartElement)
   130749    2.840    0.000  150.665    0.001 /home/dblank/gramps/trunk/src/plugins/import/ImportXml.py:2340(startElement)
    48416    2.309    0.000  119.435    0.002 /home/dblank/gramps/trunk/src/gen/db/write.py:782(update_reference_map)
    46694    1.891    0.000  111.009    0.002 /home/dblank/gramps/trunk/src/gen/db/write.py:872(__add_reference)
    20660    0.574    0.000  104.678    0.005 /home/dblank/gramps/trunk/src/gen/db/write.py:1484(commit_event)
    24208    0.431    0.000   92.818    0.004 /home/dblank/gramps/trunk/src/gen/db/write.py:1083(__add_object)
    10330    0.703    0.000   80.802    0.008 /home/dblank/gramps/trunk/src/plugins/import/ImportXml.py:1979(stop_event)
    11390    1.210    0.000   80.009    0.007 /home/dblank/gramps/trunk/src/gen/db/write.py:1375(commit_person)
     5695    0.217    0.000   64.056    0.011 /home/dblank/gramps/trunk/src/plugins/import/ImportXml.py:2103(stop_person)
    10560    0.231    0.000   63.705    0.006 /home/dblank/gramps/trunk/src/gen/db/write.py:1544(commit_note)
    24208    1.188    0.000   60.380    0.002 /home/dblank/gramps/trunk/src/plugins/lib/libmixin.py:53(find_from_handle)
    32412    0.825    0.000   51.891    0.002 /home/dblank/gramps/trunk/src/plugins/lib/libmixin.py:77(__check_from_handle)
     3588    0.285    0.000   49.109    0.014 /home/dblank/gramps/trunk/src/gen/db/write.py:1497(commit_family)
     1794    0.092    0.000   40.556    0.023 /home/dblank/gramps/trunk/src/plugins/import/ImportXml.py:1957(stop_family)
     5280    0.054    0.000   36.870    0.007 /home/dblank/gramps/trunk/src/gen/db/write.py:1188(add_note)
     4185    0.168    0.000   36.580    0.009 /home/dblank/gramps/trunk/src/plugins/import/ImportXml.py:1445(start_noteref)
        2    0.000    0.000   36.478   18.239 /home/dblank/gramps/trunk/src/cli/grampscli.py:189(open_activate)
        2    0.000    0.000   36.478   18.239 /home/dblank/gramps/trunk/src/cli/grampscli.py:202(_read_recent_file)
     4185    0.080    0.000   36.313    0.009 /home/dblank/gramps/trunk/src/plugins/lib/libmixin.py:269(check_note_from_handle)
    10330    0.960    0.000   35.070    0.003 /home/dblank/gramps/trunk/src/plugins/import/ImportXml.py:921(start_event)
    43/15    0.003    0.000   33.381    2.225 /home/dblank/gramps/trunk/src/gen/utils/callback.py:316(emit)
    10330    0.122    0.000   32.639    0.003 /home/dblank/gramps/trunk/src/plugins/lib/libmixin.py:111(find_event_from_handle)
        2    0.000    0.000   31.979   15.990 /home/dblank/gramps/trunk/src/gui/viewmanager.py:1072(_post_load_newdb)
        2    0.010    0.005   31.921   15.961 /home/dblank/gramps/trunk/src/cli/grampscli.py:234(_post_load_newdb_nongui)
       18    0.000    0.000   30.421    1.690 /home/dblank/gramps/trunk/src/DbState.py:51(change_active_person)
        6    0.000    0.000   30.381    5.063 /home/dblank/gramps/trunk/src/DataViews/GrampletView.py:360(_active_changed)
       21    0.001    0.000   29.720    1.415 /home/dblank/gramps/trunk/src/DataViews/GrampletView.py:483(update)
        5    0.001    0.000   29.720    5.944 /home/dblank/gramps/trunk/src/plugins/gramplet/PedigreeGramplet.py:224(main)
        1    0.000    0.000   29.671   29.671 /home/dblank/gramps/trunk/src/plugins/gramplet/PedigreeGramplet.py:87(active_changed)
    891/3    0.044    0.000   29.575    9.858 /home/dblank/gramps/trunk/src/plugins/gramplet/PedigreeGramplet.py:122(process_person)
     5280    0.254    0.000   27.571    0.005 /home/dblank/gramps/trunk/src/plugins/import/ImportXml.py:2236(stop_note)

To interpret the stats, you must realize that the function calls reflect a hierarchical structure. This src/gramps.py uses the most time as it is the topmost program. So, you may have to go down quite a ways to see the lower-level functions that are actually responsible for the processing. For example, /home/dblank/gramps/trunk/src/gen/db/write.py:1349(commit_base) may be a function that could be speed up.

Testing one module

An alternative to make a profile on a gramps session is to test one section of an existing module.

import cProfile
pr = cProfile.Profile()
pr.enable()
# your code here
pr.disable()
pr.print_stats(sort='time')

Additionnal tips

python 
>>> import hotshot, hotshot.stats, test.pystone 
>>> stats = hotshot.stats.load('profile-log')
>>> stats.strip_dirs() 
>>> stats.sort_stats('time', 'calls') 
>>> stats.print_stats(200) 
>>> stats.sort_stats('cumulative', 'calls') 
>>> stats.print_stats(200) 
>>> stats.sort_stats('calls') 
>>> stats.print_stats(200) 
>>> stats.sort_stats('ncalls', 'calls') 
>>> stats.print_stats(200) 
>>> stats.sort_stats('pcalls', 'calls') 
>>> stats.print_stats(200)

Profile GUI

For exploring easier, you can also try some tools like RunSnakeRun.

RunSnakeRun with Gramps profile loaded
RunSnakeRun Navigation
RunSnakeRun Navigation (top level)


Profile Links

Patches and Fixes

  • Bug#4304 Pedigree gramplet - Resolved 2011-02-18
  • Bug#4428 XML import - Resolved 2011-01-25 rev16449/50
  • Bug#8144 Uncollected Objects and profiles - Closed 2016

See also

External links