Addon:Lxml Gramplet

Please use this addon carefully, and only on data that is backed up. Help make it better by reporting any comments or problems to the author, or issues to the bug tracker.
Unless otherwise stated on this page, you can download this addon by following these instructions.
Please note that some Addons have prerequisites that need to be installed before they can be used.
This Addon/Plugin system is controlled by the Plugin Manager.
An Addons Offline Manual is available for review.

You can get a copy of this addon from the Addon repository:

https://github.com/gramps-project/addons-source/tree/master/lxml

lxml Gramplet addon detached from Dashboard
etree Gramplet addon detached from Dashboard
lxml Gramplet addon - "Is it a .gramps?" - warning message
lxml Gramplet addon - "XSD validation (lxml)" - warning message

lxml Gramplet is an experimental gramplet that works on POSIX platform(s). It reads and transforms the content of a Gramps XML file on the fly, without importing it into the database (Gramps session). Any output is written to a copy: the original file is only read, never modified.

The addon also includes the etree Gramplet, for testing the Python ElementTree module (etree) with Gramps XML.


Usage

Add these addons to the Gramps Dashboard.

As these addons are standalone, you do not need an open family tree/database to use them.

Load a saved Gramps XML backup (*.gramps) file and select the Run button.

If the Gramps XML backup (*.gramps) file is not valid, you will get an "Is it a .gramps?" warning message. Otherwise, the file is copied and the lxml Gramplet attempts XSD validation (lxml) before continuing.
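
The check-then-copy flow described above can be sketched as follows. This is a minimal sketch using only the Python standard library, not the gramplet's own code: the function names are hypothetical, the check only looks for a `database` root element, and the copy name `test.xml` is borrowed from the tracebacks in the Issues section. The XSD validation step is left out because it needs lxml.

```python
import gzip
import shutil
import xml.etree.ElementTree as ET
from pathlib import Path


def open_gramps(path):
    """Open a .gramps file for reading, handling optional gzip compression."""
    with open(path, "rb") as handle:
        magic = handle.read(2)
    # gzip streams start with the two bytes 0x1f 0x8b
    if magic == b"\x1f\x8b":
        return gzip.open(path, "rb")
    return open(path, "rb")


def is_gramps_xml(path):
    """Return True if the file parses as XML with a <database> root element."""
    try:
        with open_gramps(path) as stream:
            root = ET.parse(stream).getroot()
    except (OSError, EOFError, ET.ParseError):
        return False
    # ElementTree prefixes tags with "{namespace}"; keep only the local name
    return root.tag.rsplit("}", 1)[-1] == "database"


def copy_for_processing(path, workdir):
    """Write an uncompressed copy into workdir and return its path.

    The original file is only read, never written.
    """
    if not is_gramps_xml(path):
        raise ValueError("Is it a .gramps? " + str(path))
    target = Path(workdir) / "test.xml"  # hypothetical working-copy name
    with open_gramps(path) as source, open(target, "wb") as copy:
        shutil.copyfileobj(source, copy)
    return target
```

Checking the file before copying it means an invalid or empty input is rejected early, instead of producing an empty working copy that fails later in the parser.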


lxml Gramplet

  • lxml Gramplet - Gramplet for testing lxml and XSLT

Currently, this addon quickly explores several approaches. Feel free to modify it for your own use.

For testing only: by design, these actions are not intended for production use.

Goals

The idea of this experimental lxml gramplet is to provide a way for using basic lxml features with Gramps XML files.

XPath, XSLT, XML dump, and RelaxNG and XSD validation can all be done with lxml, which provides an API very close to the etree ElementTree module shipped with Python 2.5 and later.

The experimental lxml gramplet aims to use these lxml features[1] by parsing a Gramps XML file generated by Gramps 3.4.x (or 3.3.x) and generating an output sample, using open W3C standards (XML, Web design, Web services, etc.).


[1] see also lxml.objectify
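
To illustrate how close the two APIs are, here is a minimal sketch that runs unchanged on either lxml or the standard-library ElementTree. The sample document is a simplified assumption, not a complete Gramps XML file, and the namespace version (1.7.1 here) depends on the Gramps release that wrote the file. The function name `list_people` is hypothetical.

```python
try:
    from lxml import etree  # optional C-level implementation
except ImportError:
    import xml.etree.ElementTree as etree  # standard-library fallback

# Simplified sample; real Gramps XML files carry many more elements.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<database xmlns="http://gramps-project.org/xml/1.7.1/">
  <people>
    <person handle="_a1" id="I0001">
      <gender>M</gender>
      <name type="Birth Name"><first>John</first><surname>Smith</surname></name>
    </person>
    <person handle="_a2" id="I0002">
      <gender>F</gender>
      <name type="Birth Name"><first>Anna</first><surname>Jones</surname></name>
    </person>
  </people>
</database>
"""

# Assumed namespace; adjust it to the version declared in your file.
NS = {"g": "http://gramps-project.org/xml/1.7.1/"}


def list_people(xml_bytes):
    """Return (id, first name, surname) tuples for every person element."""
    root = etree.fromstring(xml_bytes)
    people = []
    for person in root.findall("g:people/g:person", namespaces=NS):
        first = person.findtext("g:name/g:first", default="", namespaces=NS)
        surname = person.findtext("g:name/g:surname", default="", namespaces=NS)
        people.append((person.get("id"), first, surname))
    return people


print(list_people(SAMPLE))
```

Only the path subset shared by both libraries is used here. Full XPath expressions, XSLT and schema validation are available in lxml only.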


etree Gramplet

  • etree Gramplet - Gramplet for testing etree with Gramps XML

Goals

The etree Gramplet is for testing the Python ElementTree module (etree) with Gramps XML.

Installation

Manual installation is required. You can get a copy of this addon from the repository:

https://github.com/gramps-project/addons-source/tree/master/lxml

Prerequisites

lxml Gramplet

Before the lxml Gramplet can be used you will need the following prerequisites installed:

Both are known for good speed, thanks to their C-level implementation (Cython).

etree Gramplet

No extra prerequisites are required.

Gramps XML file format

The Gramps XML file format is robust and well documented.

Example Screenshot results

  1. Titles, labels and footer are translated (they are written in the Python code).
  2. Full separation of presentation and content for the generation.


  • Local output with custom XML data in buffer and XSLT transformation
Dynamic output


  • Local output without stylesheet
Dynamic output without stylesheet


Within Gramps


  • Pseudo dynamic code generation (xml + xslt = html file)
Dynamic code generation


  • Action on surname (sort, remove duplicated)
Sorted surnames list


  • Action on place title (sort, enable cross search on place fields)
Sorted places list


  • Hardcoded list written in Python and translated by Gramps into our locale (if a translation exists)
Hardcoded list (gramps translations)
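
The surname action listed above (sort, remove duplicates) can be sketched with the standard-library ElementTree. This is an illustrative sketch only, not the gramplet's implementation: the sample data and the function name `sorted_surnames` are assumptions, and matching is done on the local tag name so that the namespace version of the file does not matter.

```python
import xml.etree.ElementTree as ET

# Simplified sample; real person records contain more than a surname.
SAMPLE = """<database xmlns="http://gramps-project.org/xml/1.7.1/">
  <people>
    <person id="I0001"><name><surname>Smith</surname></name></person>
    <person id="I0002"><name><surname>Jones</surname></name></person>
    <person id="I0003"><name><surname>Smith</surname></name></person>
    <person id="I0004"><name><surname>adams</surname></name></person>
  </people>
</database>"""


def sorted_surnames(xml_text):
    """Return the distinct surnames in the document, sorted case-insensitively."""
    root = ET.fromstring(xml_text)
    surnames = set()  # a set drops duplicates
    for element in root.iter():
        # Compare the local name only, ignoring the "{namespace}" prefix
        if element.tag.rsplit("}", 1)[-1] == "surname" and element.text:
            surnames.add(element.text.strip())
    return sorted(surnames, key=str.casefold)


print(sorted_surnames(SAMPLE))
```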


Further development

Bibliography gramplet?

  • CherryTree is a hierarchical note-taking application featuring rich text and syntax highlighting. It stores all its data (including images) in a single XML file with the extension .ctd, and integration with Zotero content is also planned.
  • Zim is a graphical text editor used to maintain a collection of wiki pages. All pages you create in zim are saved as plain text files with wiki formatting. This means that you can access your content with any other editor or file manager without being dependent on zim. You can even have your pages in a revision control system like CVS or use a Makefile to compile your notes into a webpage. Any images you add are just image files which are linked from the text files. This means that zim can call your standard programs to edit images. When you embed an image in a page the context menu for the image will offer to open it with whatever image manipulation programs you have installed. After editing you just reload the page to see the result. See also third party contributions.

Collaborative indexes

  • Scrapy is a fast high-level screen scraping and web crawling framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing. It should support Gramps XML, Gramps CSV and Gramps JSON.

Client libraries for the FamilySearch API

Serialization for the C client library or the Objective-C client library is done in conjunction with libxml2.

Comments on DB API Idea

I was basically approaching it from the leave gen.lib alone and
implement a "fully blown" SimpleAccess-esque solution.

At the moment I basically have a 'DB' object which represents an open
database. This at the moment is populated from a Gramps XML file. This
is then basically stored as lxml.objectify objects. Internally a graph
structure is built to represent the linking inside the database (so
relationships and ref. integrity is made easier).

'DBItem' objects consist of the 'node' data, the basic save/delete
etc... Deleting an event automatically removes all other references to
it (which has caught me out previously).

class Person(DBItem):
    DBTYPE = 'person'

Basically registers an object that 'wraps' a basic DBItem, but
containing useful attributes/methods. So for a person, we can write
attributes such as .birth, .mother, .families etc... etc... It can also
over-ride how it should be saved/retrieved etc...

I chose this approach because it keeps the process incremental. We can
still access the 'raw' data in a DBItem for the stuff I'm not caring
about at the moment, but someone can write a 'Place' class later for
instance.

The DB itself is an xpath queryable object (adds a bit of flexibility
for selections that don't have convenient attributes as of yet).

I'll see if I can get the code example out this week.

Anyway, does this seem a reasonable approach? 

source: Archive (Dec 07, 2009) on gramps-devel mailing list
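
The code announced in that message is not reproduced on this page. The following is only a rough sketch of the idea it describes (a DB object populated from Gramps XML, generic DBItem wrappers, a registered Person class, and reference cleanup on delete). It uses the standard-library ElementTree rather than lxml.objectify, so queries are limited to ElementTree's path subset instead of full XPath. Everything except the `DBItem`, `Person` and `DBTYPE` names is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

NS = {"g": "http://gramps-project.org/xml/1.7.1/"}  # assumed namespace version

SAMPLE = """<database xmlns="http://gramps-project.org/xml/1.7.1/">
  <events>
    <event handle="_e1" id="E0001"><type>Birth</type></event>
  </events>
  <people>
    <person handle="_p1" id="I0001">
      <name><first>John</first><surname>Smith</surname></name>
      <eventref hlink="_e1" role="Primary"/>
    </person>
  </people>
</database>"""


class DBItem:
    """Generic wrapper around a node; the raw data stays reachable."""
    DBTYPE = None
    registry = {}  # element name -> wrapper class

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        DBItem.registry[cls.DBTYPE] = cls  # register each subclass

    def __init__(self, db, node):
        self.db = db
        self.node = node

    @property
    def handle(self):
        return self.node.get("handle")

    def delete(self):
        self.db.remove(self)


class Person(DBItem):
    DBTYPE = 'person'

    @property
    def surname(self):
        return self.node.findtext("g:name/g:surname", namespaces=NS)

    @property
    def birth(self):
        """Return the first referenced event of type Birth, or None."""
        for ref in self.node.findall("g:eventref", NS):
            event = self.db.by_handle(ref.get("hlink"))
            if event is not None and event.node.findtext("g:type", namespaces=NS) == "Birth":
                return event
        return None


class DB:
    """An open 'database' populated from a Gramps XML string."""

    def __init__(self, xml_text):
        self.root = ET.fromstring(xml_text)

    def wrap(self, node):
        local = node.tag.rsplit("}", 1)[-1]
        return DBItem.registry.get(local, DBItem)(self, node)

    def query(self, path):
        return [self.wrap(node) for node in self.root.findall(path, NS)]

    def by_handle(self, handle):
        for node in self.root.iter():
            if node.get("handle") == handle:
                return self.wrap(node)
        return None

    def remove(self, item):
        """Remove the item's node and every reference (hlink) to it."""
        handle = item.handle
        for parent in list(self.root.iter()):
            for child in list(parent):
                is_ref = handle is not None and child.get("hlink") == handle
                if child is item.node or is_ref:
                    parent.remove(child)
```

For example, `DB(SAMPLE).query("g:people/g:person")` returns `Person` objects, while events come back as plain `DBItem` objects until someone writes an `Event` class, which matches the incremental approach described in the message.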

Database compare and merge

  • GrampsCompare.py, a python script for comparing data in 2 Gramps XML files.

source: Archive (Oct 02, 2011) on gramps-devel mailing list
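
As an illustration of what comparing two Gramps XML files can involve, here is a minimal sketch that indexes records by their Gramps ID and reports additions, removals and changes. It is not taken from GrampsCompare.py. The function names are hypothetical, and records are assumed to sit two levels below the root and to carry an `id` attribute.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip the "{namespace}" prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def fingerprint(element):
    """Whitespace-insensitive summary of an element and its descendants."""
    return tuple(
        (local(e.tag), tuple(sorted(e.attrib.items())), (e.text or "").strip())
        for e in element.iter()
    )


def index_by_id(xml_text):
    """Map (record type, Gramps ID) to a fingerprint of the record."""
    root = ET.fromstring(xml_text)
    index = {}
    for section in root:          # e.g. people, events, families
        for record in section:    # e.g. person, event, family
            gramps_id = record.get("id")
            if gramps_id is not None:
                index[(local(record.tag), gramps_id)] = fingerprint(record)
    return index


def compare(xml_a, xml_b):
    """Return the keys found only in one file and the keys whose data differ."""
    a, b = index_by_id(xml_a), index_by_id(xml_b)
    return {
        "only_in_a": sorted(a.keys() - b.keys()),
        "only_in_b": sorted(b.keys() - a.keys()),
        "changed": sorted(k for k in a.keys() & b.keys() if a[k] != b[k]),
    }
```

Note that attributes such as a change timestamp would also count as differences in this sketch. A real comparison would probably need to ignore them.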

Database backend

Data transfer

  • Akara is a platform for developing data services available on the Web, using REST architecture. Akara is open source software written in Python and C. It is used, for example, by the Recollection project for the Library of Congress. See the user guide or screencasts (shockwave flash) [2], [3], [4].

Environment

Faceted classification

HTML class

  • Gramps

Libhtml is an HTML/XML class for Gramps, see API.

  • Gtk3

GTK+3 provides a GDK Broadway / HTML backend that allows GTK applications to run natively within an HTML5 web browser.

See sample1, sample2, sample3.

Interface

Performances

See Gramps performances for comparison on large datasets between different Gramps versions.

Web applications

  • GEPS 013 describes a web-based application that runs in your browser, and requires a server. A prototype is now on-line at http://gramps-connect.org/ which is running trunk on a sample database (id=admin1, password=gramps).
  • DenominoViso plugin for GRAMPS is a third party plugin that creates an interactive graphical representation of a family tree. DenominoViso creates a graphical webpage in SVG/XHTML/javascript.

XQuery

"Or something close to SQL like XQuery so you can do querys on Gramps XML database similar to SQL Query. It can works even in internet browser thru plugins. XML is quite self-explanatory. Zorba provide python bindings for XQuery."
source: Archive (Oct 28, 2009) on gramps-user mailing list

Issues

  • Help buttons for both Gramplets do not go to this page.

lxml Gramplet

  • With the "lxml Gramplet", if you don't load a valid file, or just press Run without selecting a file, you get an error message popup that states "Is it a .gramps ?" and "Cannot copy "D:\filepath\...\test.gramps"". The Error Report wizard is then shown with the error message "110858: ERROR:lxmlGramplet.py line 284:Cannot copy the file".
  • With the "lxml Gramplet", loading a *.gramps file causes:
201660: ERROR: grampsapp.py: line 174: Unhandled exception
Traceback (most recent call last):
  File "D:\PortableApps\GrampsPortable\Data\settings\gramps\gramps51\plugins\lxml\lxmlGramplet.py", line 295, in ReadXML
    self.xsd(xsd, filename)
  File "D:\PortableApps\GrampsPortable\Data\settings\gramps\gramps51\plugins\lxml\lxmlGramplet.py", line 540, in xsd
    tree = etree.parse(filename)
  File "src/lxml/etree.pyx", line 3444, in lxml.etree.parse (src/lxml/etree.c:83170)
  File "src/lxml/parser.pxi", line 1834, in lxml.etree._parseDocument (src/lxml/etree.c:120742)
  File "src/lxml/parser.pxi", line 1860, in lxml.etree._parseDocumentFromURL (src/lxml/etree.c:121089)
  File "src/lxml/parser.pxi", line 1764, in lxml.etree._parseDocFromFile (src/lxml/etree.c:119997)
  File "src/lxml/parser.pxi", line 1161, in lxml.etree._BaseParser._parseDocFromFile (src/lxml/etree.c:114546)
  File "src/lxml/parser.pxi", line 598, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/etree.c:107723)
  File "src/lxml/parser.pxi", line 709, in lxml.etree._handleParseResult (src/lxml/etree.c:109432)
  File "src/lxml/parser.pxi", line 638, in lxml.etree._raiseParseError (src/lxml/etree.c:108286)
  File "file:/D:/PortableApps/GrampsPortable/Data/settings/gramps/gramps51/plugins/lxml/test.xml", line 1
lxml.etree.XMLSyntaxError: Document is empty, line 1, column 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\PortableApps\GrampsPortable\Data\settings\gramps\gramps51\plugins\lxml\lxmlGramplet.py", line 239, in run
    self.ReadXML(entry)
  File "D:\PortableApps\GrampsPortable\Data\settings\gramps\gramps51\plugins\lxml\lxmlGramplet.py", line 299, in ReadXML
    LOG.debug(self.xsd(xsd, filename))
  File "D:\PortableApps\GrampsPortable\Data\settings\gramps\gramps51\plugins\lxml\lxmlGramplet.py", line 540, in xsd
    tree = etree.parse(filename)
  File "src/lxml/etree.pyx", line 3444, in lxml.etree.parse (src/lxml/etree.c:83170)
  File "src/lxml/parser.pxi", line 1834, in lxml.etree._parseDocument (src/lxml/etree.c:120742)
  File "src/lxml/parser.pxi", line 1860, in lxml.etree._parseDocumentFromURL (src/lxml/etree.c:121089)
  File "src/lxml/parser.pxi", line 1764, in lxml.etree._parseDocFromFile (src/lxml/etree.c:119997)
  File "src/lxml/parser.pxi", line 1161, in lxml.etree._BaseParser._parseDocFromFile (src/lxml/etree.c:114546)
  File "src/lxml/parser.pxi", line 598, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/etree.c:107723)
  File "src/lxml/parser.pxi", line 709, in lxml.etree._handleParseResult (src/lxml/etree.c:109432)
  File "src/lxml/parser.pxi", line 638, in lxml.etree._raiseParseError (src/lxml/etree.c:108286)
  File "file:/D:/PortableApps/GrampsPortable/Data/settings/gramps/gramps51/plugins/lxml/test.xml", line 1
lxml.etree.XMLSyntaxError: Document is empty, line 1, column 1

etree Gramplet

  • With the "etree Gramplet", loading a *.gramps file causes:
65808: ERROR: grampsapp.py: line 174: Unhandled exception
Traceback (most recent call last):
  File "D:\PortableApps\GrampsPortable\Data\settings\gramps\gramps51\plugins\lxml\etreeGramplet.py", line 223, in run
    self.ReadXML(entry)
  File "D:\PortableApps\GrampsPortable\Data\settings\gramps\gramps51\plugins\lxml\etreeGramplet.py", line 265, in ReadXML
    tree = ElementTree.parse(filename)
  File "AIO/xml/etree/ElementTree.py", line 1196, in parse
  File "AIO/xml/etree/ElementTree.py", line 597, in parse
  File "<string>", line None
xml.etree.ElementTree.ParseError: no element found: line 1, column 0
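
All of the tracebacks above end in a parser error on an empty document. As a hedged illustration only (this is not the gramplets' code, and `safe_parse` is a hypothetical helper), the parse step could be guarded so that a missing, empty or malformed file produces a message instead of an unhandled exception:

```python
import os
import xml.etree.ElementTree as ET


def safe_parse(filename):
    """Parse filename and return (tree, None), or (None, message) on failure."""
    if not os.path.isfile(filename):
        return None, "File not found: %s" % filename
    if os.path.getsize(filename) == 0:
        return None, "File is empty: %s" % filename
    try:
        return ET.parse(filename), None
    except ET.ParseError as error:
        # Raised for content that is not well-formed XML
        return None, "Not well-formed XML: %s" % error
```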