Table Of Contents

Previous topic

Development

This Page

Architecture

Top Level

There are four database schemas available in datazilla: datazilla, schema_hgmozilla.sql.tmpl, schema_objectstore.sql.tmpl, and schema_perftest.sql.tmpl. Three of these are template schemas that are used by manage commands to create new databases with the storage engine specified by the user. Access to each schema is provided through the Model layer. The model layer is used by controllers to retrieve data in each of the schemas and is exposed to the user through a set of web service methods.

datazilla

This schema is accessed using the django ORM, the model for it is defined here. and consists of a single table with the following structure.

Field Type Null Key Default Extra
id int(11) NO PRI NULL auto_increment
project varchar(25) NO MUL NULL  
contenttype varchar(25) NO   NULL  
dataset int(11) NO   NULL  
host varchar(128) NO   NULL  
read_only_host varchar(128) YES   NULL  
name varchar(128) NO   NULL  
type varchar(25) NO   NULL  
oauth_consumer_key varchar(45) YES   NULL  
oauth_consumer_secret varchar(45) YES   NULL  
creation_date datetime NO   NULL  
cron_batch varchar(45) YES   small  

All databases storing data used by datazilla are stored as a row in this table. Each database has three classifiers associated with it: project, contenttype, and dataset. The name of the database is typically these three classifiers joined on an underscore but there is no requirement for this, the name can be any string. There is no physical requirement for the databases referenced in this table to be co-located. The only requirement is that both the web service and machine that run’s the cron jobs have access to each of the databases in this table. Any database can have OAuth credentials associated with it but they are not required so the field can be null. Currently the only databases that require OAuth are the objectstore and only for the storage of the JSON object. Each database can also have a cron batch interval associated with it. This interval specifies the time interval of cron jobs run.

project - A descriptive string associated with the project: talos, b2g, schema etc... This string becomes the location field in the url for related web service methods.

contenttype - A string describing the content type associated with the database. The perftest content type stores performance test results, the objectstore content type stores JSON objects in it. A project can have any number of contenttypes associated with it.

dataset - An integer that can be enumerated. This allows more than one database to exist for the same project/contenttype pair.

host - Name of the database host.

read_only_host - A read only host associated with the database.

name - Name of the database.

type - Type of storage engine associated with the database. This is automatically added to the template schema when a user runs a manage command that creates a database schema. There is currently support for MariaDB and MySQL storage engines.

oauth_consumer_key - The OAuth consumer key. This is created for databases with objectstores automatically by the create_project manage command.

oauth_consumer_secret - The OAuth consumer secret. This is created for databases with objectstores automatically by the create_project manage command.

creation_date - Date the database was created.

cron_batch - The cron interval to use when running cron jobs on this database.

schema_hgmozilla.sql.tmpl

The hgmozilla schema currently holds the mozilla mercurial push log data. However, the only part of it that’s specific to mercurial is the web service method used to retrieve data to populate it. The data used to populate the schema is generated by the json-pushes web service method. The manage command, update_pushlog, calls this web service method and populates the associated schema. The data can be used to create an ordered list of code base changes pushed to the build/test system. This is required for any statistical method that requires a comparison between a push and its parent.

schema_objectstore.sql.tmpl

The objectstore schema holds the unprocessed json objects submitted to the project. When objects are successfully processed into a corresponding index the test_run_id field is populated with an integer. The test_run_id corresponds to the test_run.id field in the perftest schema.

schema_perftest.sql.tmpl

This perftest schema translates the JSON structure in the objectstore into a relational index. It also contains tables for the storage of statistical data generated post object submission.

Model

The model layer found in /datazilla/model provides an interface for getting/setting data in a database. The datazilla model classes rely on a module called datasource. This module encapsulates SQL manipulation. All of the SQL used by the system is stored in JSON files found in /datazilla/model/sql. There can be any number of SQL files stored in this format. The JSON structure allows SQL to be stored in named associative arrays that also contain the host type to be associated with each statement. Any command line script or web service method that requires data should use a derived model class to obtain it.

ptm = PerformanceTestModel(project)
products = ptm.get_product_test_os_map()

The ptm.get_product_test_os_map() method looks like this:

def get_product_test_os_map(self):
    proc = 'perftest.selects.get_product_test_os_map'

    product_tuple = self.sources["perftest"].dhub.execute(
        proc=proc,
        debug_show=self.DEBUG,
        return_type='tuple',
        )

    return product_tuple

perftest.selects.get_product_test_os_map found in datazilla/model/sql/perftest.json looks like this:

{
    "selects":{

       "get_product_test_os_map":{

          "sql":"SELECT b.product_id, tr.test_id, b.operating_system_id
                 FROM test_run AS tr
                 LEFT JOIN build AS b ON tr.build_id = b.id
                 WHERE b.product_id IN (
                   SELECT product_id
                   FROM product )
                GROUP BY b.product_id, tr.test_id, b.operating_system_id",

           "host":"master_host"
       },

       "...more SQL statements..."
}

The string, perftest, in perftest.selects.get_product_test_os_map refers to the SQL file name to load in /datazilla/model/sql. The SQL in perftest.json can also be written with placeholders and a string replacement system, see datasource for all of the features available.