arkL - login

arkL database of stratigraphic data from TimeScaleCreator - introduction

The TimeScaleCreator (TSC) program is based on a uniquely extensive archive of stratigraphic data compiled by James Ogg, Felix Gradstein and numerous colleagues, as also presented in the GTS2020 volume. The arkL project aims to convert this data into an openly accessible relational database.

The current website is a first step toward this. It includes most chronostratigraphic, biostratigraphic and magnetostratigraphic event calibrations and interval definitions from TSC from the Cambrian to the Recent, together with a bibliography and geochemical datasets. Planned developments include improved referencing linkage, linkage with mikrotax and other taxonomic databases, and provision of an API to allow cross-linkage from other websites.

The data presented here should be identical to that provided by the current version of TimeScaleCreator (8.1), apart from minor corrections made during compilation of the database (and possibly some mistakes introduced during compilation). The MySQL database includes tables of events (13,000 entries), intervals (10,000 entries) and references (5,000 entries), together with metadata tables on datasets (135 entries - e.g. Tethyan nannofossils) and TimeScaleCreator columns (500 entries). In addition, various proxy curves contained in TSC are included - e.g. isotope data and orbital cyclicity solutions.

As currently configured, the site will primarily be of use to earth scientists working with geological time who need reference data. A basic use case would be to use the TimeScaleCreator program to explore the data and this site to download subsets of data. It will also be possible to use this site to make corrections to the underlying database, although these corrections will not immediately be reflected in the publicly available version of TSC.

The website allows viewing (and editing) of all the tables. The multicolumn view page allows display of the data in a similar way to that of TSC. The most immediately useful page will probably be the dataset page, as this synthesises single sets of data (e.g. Tethyan ammonite zones). Text files of the data can be downloaded from this page. The column view page provides data precisely corresponding to single columns in TimeScaleCreator, and graphic mock-ups of the corresponding TSC columns. This page is probably easier to browse. The references page may be of use as an online bibliography of stratigraphic literature, and references can be exported in .ris format to allow easy import to reference management software such as EndNote. The other pages are primarily for editors, but browsing them may be useful for anyone interested in the system. There is no API yet. This website will not replace TSC as the system for end users or for production of high quality multi-column charts.

This project is funded by the TimeScaleCreator Foundation and the Digital Deep Earth Program

Status March 2026: the site is under active development, and neither the content nor the software is complete or fully checked. The bulk of the content from TSC has now been entered into the database, but a lot of checking and revising is still needed. Please send any feedback and comments to Jeremy Young - jeremy.young@ucl.ac.uk or James Ogg.

Editors' login for the arkL database

To do any editing on the arkL database you will need to log in with a username and password provided by Jeremy Young. You can, however, browse the system and download data without doing this. NB The database is still under active development and is not open for dispersed editing, but corrections are welcome.

Database schema

Figure: Outline database schema for the arkL database
Notes
  1. This schema is indicative only: some fields are omitted, and the linkage lines are not precise.
  2. taxon_links will provide the basis for linking to external taxonomic databases such as mikrotax or dinoflaj. These links are under development.

Database tables summary

DATASETS TABLE

A dataset is a set of interrelated data - e.g. Tethyan ammonites, standard chronostratigraphy. Most datasets include both intervals and events. The dataset table does not contain any basic data but rather metadata about the whole dataset. The actual data is stored in the events and intervals tables. When a dataset is compiled there is one set of events but there may be multiple sets of intervals - e.g. a single sequence of nannofossil events may be variously used to define several alternative zonations.

EVENTS TABLE

An event is a time horizon, such as the first or last occurrence of a taxon, or the base of a magnetochron or chronostratigraphic stage. All event data is stored in one big table. In many datasets in TSC (e.g. chronostrat or magnetostrat datasets) the events are not displayed in separate columns although they are defined in the database. E.g. base Cenomanian is used to define the base of the Cenomanian, the Early Cenomanian and the Late Cretaceous, but it is not plotted as a labelled event. In other cases the events may be plotted and may be divided into several columns - e.g. planktonic foram events are divided between columns of "marker events" and "other events".
The age of an event is calculated in one of four ways, each of which has its own set of fields:
  1. Preset age - an externally determined Ma age, from plate spreading model, orbital cyclicity, CONOP spline or radiometric dating. Fields: preset_age, preset_age_notes.
  2. Placement within an interval - this is by far the most common. Fields: pup (proportion up, 0->1), within_intv, within_intv_id.
  3. Placement between two events - used when the placement does not fall within a single defined interval. Fields: pub (proportion up between), hev, hev_id, lev, lev_id (hev = higher event, lev = lower event).
  4. Age offset from an event. Fields: offset (a value in Ma; +ve values imply the event is below the event it is offset from, -ve values above), offset_from_event, offset_from_event_id.
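The arithmetic behind methods 2 to 4 can be sketched as below (method 1 simply uses preset_age directly). This is a minimal illustration, not the site's own code. It assumes that ages are in Ma (so older means larger), that pup and pub run linearly from 0 at the lower (older) bound to 1 at the upper (younger) bound, and that the function and parameter names are hypothetical.

```python
def age_within_interval(base_age: float, top_age: float, pup: float) -> float:
    """Method 2: pup = 0 gives the interval base, pup = 1 the interval top (assumed linear)."""
    return base_age - pup * (base_age - top_age)


def age_between_events(lev_age: float, hev_age: float, pub: float) -> float:
    """Method 3: pub = 0 gives the lower event (lev), pub = 1 the higher event (hev)."""
    return lev_age - pub * (lev_age - hev_age)


def age_from_offset(reference_age: float, offset: float) -> float:
    """Method 4: a positive offset places the event below (older than) the reference event."""
    return reference_age + offset


# Toy values in Ma, invented for illustration only.
print(age_within_interval(base_age=100.0, top_age=90.0, pup=0.25))  # 97.5
print(age_between_events(lev_age=66.0, hev_age=56.0, pub=0.5))      # 61.0
print(age_from_offset(reference_age=94.0, offset=0.5))              # 94.5
```

Because most events are placed relative to intervals or other events, a change to one calibrated age propagates to everything defined from it when the ages are recalculated.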

INTERVALS TABLE

An interval is a period of time such as a biostratigraphic zone or a chronostratigraphic stage. All interval data is stored in one big table. A dataset may contain more than one group of intervals (e.g. different zonations based on the same set of events), but each interval only appears in one column. Alternative schemes often use intervals with the same name, so an added abbreviation is used to make them unique (see "Naming of events and intervals" below).

COLUMNS TABLE

The columns table has one row for each column which can be plotted in TSC. Its purpose is to provide an index of the columns and metadata on them, including data on how they are plotted.

SUB-DATASETS TABLE

These, and sub-columns, were a previous approach to structuring the data and specifically to dealing with the problem that single datasets may contain several columns of closely related data. They are now being eliminated, in favour of placing all the relevant data in the columns table - and explicitly indicating in the intervals and events tables which column (if any) each item belongs in.

POINT-DATA

TSC also includes comparison data such as oxygen isotope curves and orbital parameters. These are stored as age-value pairs to enable curve plotting. Age-value pairs for plots can be stored in two ways:
  1. In the events table with plot value data stored in either the pointdata1 or pointdata2 columns. Storing here is good for small datasets which have ages calibrated against something else.
  2. Separate tables with an age column and with the point value stored in any named column. This is the ideal solution for datasets with lots of data, and with ages given in Ma / externally calibrated. These tables are given arkZ names - e.g. arkZ_Laskar2004 (Milankovitch solutions) and arkZ_Westerhold (Cenogrid stable isotope data). These tables supplement the main data model, so they can easily be added, or replaced with new versions. In the current model, to use events from this type of data (e.g. C-isotope events, or cyclostrat cycles) these events need to be extracted and stored as event tables.
NB Sequence data curves are a special case - see separate notes.

Revised querying system - Feb 2026

The subdataset/sub-column system was getting complex and confusing, so I have changed to using only datasets and columns, with extra fields added to the events and intervals tables indicating the column_id of the column they are displayed in (if they are displayed). This should make selecting the set of data needed for a column much simpler. NB The column view page is the key one for understanding how the data for each TSC column is defined, and the MySQL queries are shown on that page. This should be particularly useful for understanding the complications noted below.

Query for intervals column

Basic query: select i.*,e.event_display from arkL_intervals as i left join arkL_events as e on i.base_id=e.id where i.column_id=".$column_id." and base_age<= ".$agewindowbase." and top_age >= ".$agewindowtop." order by base_age, top_age
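As a hedged illustration of the basic intervals query, the sketch below runs the same selection with bound parameters instead of string concatenation. It uses Python's built-in sqlite3 module purely as a stand-in for MySQL, with a cut-down toy schema. The table and field names (arkL_intervals, arkL_events, column_id, base_id, base_age, top_age, event_display) come from the query above; the name field, the function name and all row values are invented for the example.

```python
import sqlite3


def intervals_for_column(conn, column_id, window_top, window_base):
    """Return the intervals of one column lying within an age window (ages in Ma)."""
    sql = (
        "SELECT i.*, e.event_display "
        "FROM arkL_intervals AS i "
        "LEFT JOIN arkL_events AS e ON i.base_id = e.id "
        "WHERE i.column_id = ? AND i.base_age <= ? AND i.top_age >= ? "
        "ORDER BY i.base_age, i.top_age"
    )
    # Values are bound as parameters, never pasted into the SQL string.
    return conn.execute(sql, (column_id, window_base, window_top)).fetchall()


# Toy in-memory database; schema and rows are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.executescript("""
CREATE TABLE arkL_events (id INTEGER PRIMARY KEY, event_display TEXT);
CREATE TABLE arkL_intervals (
    id INTEGER PRIMARY KEY, name TEXT, column_id INTEGER,
    base_id INTEGER, base_age REAL, top_age REAL);
INSERT INTO arkL_events VALUES (1, 'base A'), (2, 'base B'), (3, 'base C');
INSERT INTO arkL_intervals VALUES
    (10, 'Zone A', 7, 1, 20.0, 15.0),
    (11, 'Zone B', 7, 2, 15.0, 10.0),
    (12, 'Zone C', 7, 3, 40.0, 30.0),
    (13, 'Zone D', 8, 1, 20.0, 15.0);
""")

rows = intervals_for_column(conn, column_id=7, window_top=5.0, window_base=25.0)
names = [row["name"] for row in rows]
print(names)  # ['Zone B', 'Zone A']
```

Note that the two age conditions select intervals lying wholly inside the window: Zone C is dropped because its base (40 Ma) is older than the window base, and Zone D because it belongs to another column.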
Complications:

Query for events column

Basic query: select * from arkL_events where column_id=".$column_id." AND age > ".$agewindowtop." and age < ".$agewindowbase." order by age
Complications:

Naming of events and intervals

Events and intervals need names which are unique within the database. For most intervals and events this is achieved by adding an extra abbreviation - e.g. TAZ (Tethyan Ammonite Zone) or DeL (German Lithostrat unit). Biostrat events are an exception: they are prefixed by FAD or LAD and taxon names are already unique, so an added abbreviation is not needed. Likewise standard chronostrat names (e.g. Albian) are assumed to be unique and do not have an added abbreviation. An extra field - has_added_abv - indicates whether the last word of the name is an added abbreviation. For diagram plotting these added abbreviations need to be removed.
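Removing the added abbreviation for plotting can be sketched as follows. This is a minimal illustration that assumes the abbreviation is always the final space-separated word, as described above; the function name and the example names are hypothetical.

```python
def display_name(name: str, has_added_abv: bool) -> str:
    """Return the name to plot, dropping the trailing added abbreviation if flagged."""
    if not has_added_abv:
        return name
    head, sep, _abv = name.strip().rpartition(" ")
    # If the name is a single word there is nothing to strip, so keep it as is.
    return head if sep else name


print(display_name("Example Zone TAZ", True))  # Example Zone
print(display_name("Albian", False))           # Albian
```

Keying the behaviour on has_added_abv rather than on a list of known abbreviations means that names which happen to end in a short capitalised word are left alone unless they are explicitly flagged.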

Events and intervals also have unique id codes and these are what is used to make hard links, so the actual names can safely be edited after upload - e.g. the 'Stanton (and Cass) Seq' has the unique interval id 11495. So, if the name needs to be updated, it can be without losing any data connections.

Special data types

The prime data types are chronostratigraphic units and biostratigraphic zones and events. Other types of data need special treatment to varying degrees.

Magnetostratigraphic data & anoxic intervals

This is basically just handled as a normal intervals coloured column, but a special polarity field is used to define the colour (n -> black; r -> white; no data ->grey; uncertain -> grey). Anoxic intervals are coded the same way, since this is what is done in TSC: n (anoxia) -> black; u (dysoxia) -> grey.
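The colour coding just described amounts to a small lookup, sketched below. The codes n, r and u and their colours are taken from the text; how "no data" and "uncertain" are actually coded in the polarity field is not stated, so falling back to grey for any other value is an assumption, and the names used are hypothetical.

```python
# Colour codes as described in the text above.
POLARITY_COLOURS = {"n": "black", "r": "white"}
ANOXIA_COLOURS = {"n": "black", "u": "grey"}


def fill_colour(code, anoxic=False):
    """Map a polarity-field code to a fill colour, defaulting to grey (assumed)."""
    table = ANOXIA_COLOURS if anoxic else POLARITY_COLOURS
    return table.get(code, "grey")


print(fill_colour("n"))               # black
print(fill_colour("r"))               # white
print(fill_colour(None))              # grey
print(fill_colour("u", anoxic=True))  # grey
```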
Naming of units - in the Cenozoic and Late Cretaceous a threefold hierarchy of terms is applied to chrons, e.g. C5, C5n, C5n.1n (see table). In the Early Cretaceous to Middle Jurassic the M-Sequence is similarly named. In both the C-Sequence and the M-Sequence, units are numbered from the top down. From the Early Jurassic downwards numerous separate sequences are recognised with less consistent nomenclature; typically, though, there are only the equivalents of full chrons and subchrons, and numbering is done from the bottom up. All intervals and events need unique names, so sub-chrons have been numbered for the database. These terms are unpublished and should not be used in other contexts.
level | example | display in TSC | used in equations?
sequence | C-Sequence | column of named intervals = series label column | no
full chron | C5 | column of named intervals = chron label column | no
chron | C5n | | yes
sub-chron | C5n.1n | graphical column = polarity chrons column | yes
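The levels in the table can be told apart from the name alone for C-Sequence and M-Sequence style names. The sketch below is an illustration only. The regular expression is an assumption covering names shaped like the examples C5, C5n and C5n.1n (a C or M prefix, a number, optional capital letters, a polarity letter n or r, and an optional numbered sub-chron); it is not the naming rule used by the database, and it does not attempt the older, less consistently named sequences.

```python
import re

# Assumed pattern: full chron, optional polarity, optional ".<number><polarity>" sub-chron.
_CHRON = re.compile(
    r"^(?P<full>[CM]\d+[A-Z]*)(?P<pol>[nr])?(?:\.(?P<sub>\d+)(?P<subpol>[nr]))?$"
)


def chron_level(name):
    """Return 'full chron', 'chron' or 'sub-chron', or None if the name does not fit."""
    match = _CHRON.match(name)
    if match is None:
        return None
    if match.group("sub") is not None:
        return "sub-chron"
    if match.group("pol") is not None:
        return "chron"
    return "full chron"


for example in ("C5", "C5n", "C5n.1n"):
    print(example, "->", chron_level(example))
```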

Sequence data

See separate page.

Lithostratigraphic/facies data

For these there are additional lithology & lithology_id fields in the intervals table, which link to a lithology table. In the lithology table each lithology has a colour and column width defined; these are used to draw a lithological log (NB the log is against time, not bed thickness). The table also gives the TSC lithology_code, although I don't use that in my plots. For the database every bed needs a unique name (intervalx term), so arbitrary terms have been added where they are not given in the workbooks.
The query to get lithology column data is: select i.*,e.event_display,l.colour as lithcolour,l.width from arkL_intervals as i left join arkL_events as e on i.base_id=e.id left join arkL_lithology as l on i.lithology_id=l.id where i.column_id=".$column_id." and base_age<= ".$agewindowbase." and top_age >= ".$agewindowtop." order by base_age, top_age
NB Some add-in packs use different sets of lithology codes (e.g. Australian or Norwegian). To make sure the correct ids are used it will be necessary to modify the upload code.
Typically in TSC, lithostrat datasets have four columns with rather idiosyncratic titles.
  1. Lithology Graphic log - no consistent name is applied to this column but it is the first listed.
  2. Members -> names of the units in the graphic log, almost always blank
  3. Facies labels -> Formation names for sets of beds/members
  4. Series label-> Names for sets of Formations

Menu items

The menu provides links to the set of scripts/pages used for viewing and editing the system:
introduction and login: this page
notes on sequence data: explanation of how sequence stratigraphic data is handled and plotted
dataset view: page displaying for any selected dataset the metadata for that dataset, a tabular summary of the intervals and events, and a graphical plot of the columns
column view: page displaying for any selected column the metadata for that column, a tabular summary of the intervals or events, and a graphical plot of it
event view: page displaying for any selected event the metadata for that event
interval view: page displaying for any selected interval the metadata for that interval
references: searchable bibliography
taxon links: page to set links to external taxonomy databases - to be developed
upload data: page to upload a table of data defining a dataset
populate (integrate and check data): page to repopulate the database after uploading or editing data and to check calculated ages against the ages in TSC8 (2020)
On all the view pages, logged-in editors can edit the metadata for that dataset, column, interval or event.