XBRL Database

While individual XBRL filings can be processed, validated, and inspected by Arelle one-by-one, the real benefit of data in XBRL is when it can be processed across large collections of filings.  Arelle implements a number of database strategies for this and supports three approaches, (1) Abstract Model, (2) XBRL-US, and (3) DPM database architectures.

Database backups of SEC filings (2012 to current) are available for Postgres on the downloads page, this is handy for one to experiment on their own laptop or server, without going through the loading process on their own.

Database bulk loading is provided by the xbrlDB.py plugin.  Databases can be loaded in batch by an RSS feed file, by individual XBRL entry points or by scripted command files.  Use of the RSS feed file provides additional filing information (such as SEC filing references and SIC code of company).

Currently three general approaches are implemented, the first two evolved from that presented in the KU 2013 XBRL Conference paper:

  • XBRL Abstract Model Database. The XBRL Abstract Model represents an effort to step above the issues of syntax, specific models and taxonomies, and software strategy. It tries to bridge the modeling semantics, financial ontology, data points modeling, and other XBRL communities, with a view to fitting into big-iron databases and OLAP/BI systems. The Abstract Model is based on OMG’s CWM, of which OLAP is a subset, but the dimensions model in XBRL calls for a dynamic form of OLAP that hasn’t been implemented by relational databases.  Specified is a meta-model, it defines the framework for constructing an instance model that can be implemented in an actual database. In this implementation, we chose to make an instance model closely track the organization of the meta model’s data dictionary and instance features, but did not complete building out an instance model of the dimensional validation or data typing features.(The abstract model implementation has not yet provided the view schema, initially based on a prior version of the Table Linkbase.  It will be implemented basing on the models of the current Table Linkbase.)  The SQL schema for the abstract model is (for now) denormalized from that of the XBRL-US schema, so that qnames are clark-notation strings (instead of separate tables for namespaces and qnames), but this is an experiement in query convenience and could be changed based on feedback.The abstract model, when stored into the SQL databases, also provides words mining from 10-Q and 10-K documents (e.g., the non-tagged MD&A section, using a document taxonomy) and form type UPLOAD comment letters of filed documents.
  • XBRL-US Public Database. This represents an thorough and representation of XBRL, at a syntactical level, for US filings. The XML artifacts of XBRL are captured in a highly normalized set of tables that are sufficient for retention of present SEC filings. It isn’t used to handle non-US prudential filings, corporate action filings, or massively huge filings. Its highly normalized tables provide good access to retrieve and reconstruct single filings but need query support for cross-filing semantics (where data is bound to linkbase graphs of different structures). It’s currently useed by several Arelle users as a basis for mining or ETL of data.
  • DPM Database. This database implements version 7 of the DPM database evolved from the EBA project and planned for EIOPA’s tool sets.  The architecture is closely aligned to use of the XBRL Table Linkbase (not suited for current SEC Filings).

Database products supported

(1) The Abstract Model, described in the above-referenced KU paper, is currently supported by these database interfaces:

(2) The XBRL-US Public Database is currently supported by a Postgres interface. Schema DiagramSchema DDL.

(3) The DPM Database is currently supported by the SQLite interface.

SQL Schema Initialization

When xbrlDB SQL interfaces encounter a database with none of the DDL tables, they attempt to upload the corresponding DDL, but this requires proper permissions and can be problematical.  We’ve found it easier to initialize the SQL databases manually through their usual command interfaces.  The DDL files are linked above, and included in the installation under plugin/xbrlDB.  Examples assume the database has been created and named “test_db”.
  • Postgres. One may need to sudo to the postgres user id to manage tables.   The XBRL-US and Abstract Model schemas are not compatible and should be in separate databases if both are used.
    sudo -u postgres psql -d test_db
    (for XBRL-US public database)
    \i xbrlPublicPostgresDB.ddl
    (or alternatively, for abstract model database)
    \i xbrlSemanticPostgresDB.ddl
    \q
  • MySQL.  One may need to sudo to MySql root id to manage tables.
    sudo mysql --user=root --password=xxxxx test_db
    source xbrlSemanticMySqlDB.ddl
    \q
  • SQLite.  There are a number of utilities for SQLite. One alternative to load the database tables is to paste the portion of the file above standard industry code insertions (the CREATE TABLEs, etc) into an SQL editor for this product, such as SQLiteStudio.
  • Microsoft SQL Server.
    Start button -> Programs -> Microsoft SQL Server Management Studio (2012+)
    Connect dialog
    Expand tree view -> Databases -> select test_db
    File menu -> Open -> xbrlSemanticMSSql.sql
    Toolbar->Execute button or press F5
  • Oracle 11g+
    Command prompt window (e.g., C:> dialog box)

    sqlplus
    enter user and password
    @xbrlSemanticOracle.sql
    exit

xbrlDB plug-in setup

Using GUI:
Help menu ->   Manage plug-ins -> Dialog.  Press “Locally” on left under “find plug-in modules” and select directory plugin, directory xbrlDB and click __init__.py.
For a single taxonomy or entry point: open the DTS or instance.  Validate if you wish
Tools -> Store to XBRL DB (parameters same as below for command line bulleted list).
Using command line interface:
For a single taxonomy, entry point, or RSS feed file:
arelleCmdLine -f file -v
  --plugins xbrlDB
  --store-to-XBRL-DB
  "host,port,user,password,
   database,timeout,technology"
where:
  • file is the RSS feed file (local or web), or instance or DTS entry point file.  (Taxonomy package mapping will be honored, if in effect.)
  • host is for Postgres, MySql, Rexster, and RDF server the TCP/IP host (such as foo.com, or localhost), for MS SQL the service (such as herm\SQLEXPRESS), for Oracle the service (such as orcl).  For RDF to be stored in a file, use rdfTurtleFile or rdfXmlFile. For JSON to be stored in a file, use jsonFile. (File names go in the database field.)  For SQLite leave this field empty.
  • port is for Postgres and MySql the port number if nonstandard (such as 8084, an integer), and omitted for SQLite, MS SQL and Oracle
  • database is the SQL database (all SQL interfaces), the Rexster database, or the Nanodata subdirectory.  For SQLite, rdf or json file saving, this field is a full path file name on local file system.
  • timeout (when provided) overrides the default interface timeout (which may be too short or too long).
  • technology code descriptions:
    technology description
    pgSemantic Abstract Model SQL (Postgres)
    mssqlSemantic Abstract Model SQL (MSSQL)
    mysqlSemantic Abstract Model SQL (MySQL)
    sqliteSemantic Abstract Model SQL (SQLite)
    orclSemantic Abstract Model SQL (Oracle)
    rexster Abstract Model Rexter (Titan, Cassandra)
    rdfDB Abstract Model RDF (Turtle, NanoSparqlServer)
    json Abstract Model JSON (Files, MongoDB)
    postgres XBRL-US Public SQL (Postgres)
    sqliteDpmDB DPM SQL (SQLite)
Examples:
Store into postgres database using XBRL-US schema:
arelleCmdLine -f c:\temp\test.rss -v
  --disclosureSystem efm-pragmatic-all-years
  --plugins xbrlDB
  --store-to-XBRL-DB
  "myhost.com,8084,pgUserId,pgPassword,test_db,90,postgres"
Store into local MSSQL database (2012+) using SQLEXPRESS instance:
arelleCmdLine -f c:\temp\test.rss -v
  --disclosureSystem efm-pragmatic-all-years
  --plugins xbrlDB 
  --store-to-XBRL-DB
  "localhost\SQLEXPRESS,,sqlLogin,sqlPassword,,90,mssqlSemantic"

Preloaded Database

Here is a preloaded database of SEC filings, as a Postgres pg_dump gzip’ed file. Install pgAdmin III, so that you may down load this file and begin with a full set of SEC filings, word-mined primary documents, and comment letters.
click for download instructions
(DB size 417GB as of 2014-10, does not include UPLOAD comment correspondence)
 

72 Responses to XBRL Database

  1. Jason says:

    I am using the XBRL-DB plugin, and I have run into an issue. It seems like the period is not getting stored correctly based on the context for each data point.

    For example, when I query the data_point table and get ‘cash and equivalents’ for a single filing, I will probably get 3 rows. The values will be different (of course) and the context_xml_id will be something like ‘context_as_of_Dec2011, ‘context_as_of_Dec2012, ‘context_as_of_Dec2013’. I would expect the period_id to then point to the appropriate period, but period_id is null.

    I am trying to read through XbrlSemanticSqlDB.py, but unfortunately, I never learned Python so I’m not making much progress.

    Any help would be appreciated. I may just be missing a setting somewhere.

    • admin says:

      thank you for reporting!!! bug fixed in gitHub, https://arelle.atlassian.net/browse/ARELLE-318, will be in next build, postgres preloaded database will be replaced

      • Jason says:

        I installed the update, but it still doesn’t seem to be assigning periods to all contexts. The ones that are missing are similar to the following:

        0001037038

        2014-03-29

        I am using the Windows-64 version with MS SQL Server.

      • Jason says:

        So I did some more research and confirmed the context exists and the dates are getting correctly loaded in the period table, BUT the period_id is not being stored in the data_point table.

        Unfortunately I am not an expert in Python any further assistance would be appreciated.

        Could it have something to do with trying to find the record in the period table, but since the start date is null the null comparison isn’t working right?

        • Marinos says:

          Jason please tell me the command you use to query the data_table and get cash and cash equivalents

          • Jason says:

            Here is a basic SQL statement that will get you cash and cash equivalents. NOTE: This is not guaranteed to work for all companies, all the time, but it should be a good place to start.


            select dp.datapoint_id
            , a.name
            , p.end_date
            , dp.effective_value
            from data_point dp
            join aspect a
            on dp.aspect_id = a.aspect_id
            join period p
            on dp.period_id = p.period_id
            where a.qname = '{http://fasb.org/us-gaap/2013-01-31}CashAndCashEquivalentsAtCarryingValue'

            Is this what you were after?

  2. Bill Shao says:

    Does XbrlDB support Windows authentication? What component is required on the client side? We have a problem connecting a Microsoft SQL Server.

    • admin says:

      we have been able to use this kind of parameter: –store-to-XBRL-DB “localhost\SQLEXPRESS,,userID,password,test_db,90,mssqlSemantic”, might need python-MSSQL experienced advice to get it working with Windows authentication

    • Jason says:

      You could theoretically just alter the SqlDb.py file by removing the UID and PWD and adding Trusted_Connection=yes. You would need to test it, but conceptually that should work to enable windows authentication instead of requiring a username/password.

      You could still pass in a username/password, but they would be ignored.

      • admin says:

        I tried this and id didn’t work; please let support@arelle.org, or a JIRA ticket to arelle.atlassian.net, know if you find a working solution.

        • Jason says:

          I was able to get it to work with windows authentication. Change line 178 in SqlDb.py to the following. Note the removal of the UID and Password elements.


          self.conn = mssqlConnect('DRIVER={{SQL Server Native Client 11.0}};SERVER={0};DATABASE={1};Trusted_Connection=Yes;CHARSET=UTF8'
          .format(host, # e.g., localhost\\SQLEXPRESS
          database))

          You can leave the username and passwords blank when using Arelle, and it will authenticate using Windows.

          NOTE: Arelle creates cached copies of the .py files so I had to unload the plugin, exit Arelle, and then delete the cache for it to pick up my change.

  3. Fred says:

    I am getting this error when I try to ‘Store to XBRL DB’ using a Postgres abstract model sql. Any ideas why?

    Loading XBRL DB: UnicodeEncodeError: ‘ascii’ codec can’t encode character ‘\u2019′ in position 61202: ordinal not in range(128)

    File “C:\\Program Files\\Arelle\\plugin\\xbrlDB\\__init__.py”, line 109, in backgroundStoreIntoDB\n product=product)\n’, ‘
    File “C:\\Program Files\\Arelle\\plugin\\xbrlDB\\XbrlSemanticSqlDB.py”, line 67, in insertIntoDB\n xbrlDbConn.insertXbrl(rssItem=rssItem)\n’, ‘
    File “C:\\Program Files\\Arelle\\plugin\\xbrlDB\\XbrlSemanticSqlDB.py”, line 155, in insertXbrl\n self.insertResources()\n’, ‘
    File “C:\\Program Files\\Arelle\\plugin\\xbrlDB\\XbrlSemanticSqlDB.py”, line 709, in insertResources\n checkIfExisting=True)\n’, ‘
    File “C:\\Program Files\\Arelle\\plugin\\xbrlDB\\SqlDb.py”, line 857, in getTable\n result = self.execute(sqlStmt,commit=commit, close=False, fetch=fetch, params=params)\n’, ‘
    File “C:\\Program Files\\Arelle\\plugin\\xbrlDB\\SqlDb.py”, line 320, in execute\n cursor.execute(sql)\n’, ‘
    File “c:\\python33\\lib\\site-packages\\pg8000\\dbapi.py”, line 243, in _fn\n’, ‘
    File “c:\\python33\\lib\\site-packages\\pg8000\\dbapi.py”, line 312, in execute\n’, ‘
    File “c:\\python33\\lib\\site-packages\\pg8000\\dbapi.py”, line 317, in _execute\n’, ‘
    File “c:\\python33\\lib\\site-packages\\pg8000\\interface.py”, line 300, in execute\n’, ‘
    File “c:\\python33\\lib\\site-packages\\pg8000\\interface.py”, line 105, in __init__\n’, ‘
    File “c:\\python33\\lib\\site-packages\\pg8000\\protocol.py”, line 938, in _fn\n’, ‘
    File “c:\\python33\\lib\\site-packages\\pg8000\\protocol.py”, line 1075, in parse\n’, ‘
    File “c:\\python33\\lib\\site-packages\\pg8000\\protocol.py”, line 993, in _send\n’, ‘
    File “c:\\python33\\lib\\site-packages\\pg8000\\protocol.py”, line 121, in serialize\n’]

  4. Bill Shao says:

    The xbrlSemanticMSSqlDB.sql has many errors. After fixing the errors and creating all tables, the latest Arelle version (06-14) doesn’t work with this MS SQL schema.

    The error message is:
    [] [Exception] Failed to complete request:
    ‘NoneType’ object has no attribute ‘get’
    [‘ File “Z:\\Documents\\mvsl\\projects\\Arelle\\ArelleProject\\src\\arelle\\Cnt
    lrCmdLine.py”, line 830, in run\n’, ‘ File “C:\\Program Files\\Arelle\\plugin\\
    xbrlDB\\__init__.py”, line 205, in xbrlDBCommandLineXbrlRun\n storeIntoDB(dbC
    onnection, modelXbrl)\n’, ‘ File “C:\\Program Files\\Arelle\\plugin\\xbrlDB\\__
    init__.py”, line 166, in storeIntoDB\n result = insertIntoDB(modelXbrl, host=
    host, port=port, user=user, password=password, database=db, timeout=timeout, pro
    duct=product, rssItem=rssItem, **kwargs)\n’, ‘ File “C:\\Program Files\\Arelle\
    \plugin\\xbrlDB\\XbrlSemanticSqlDB.py”, line 68, in insertIntoDB\n xbrlDbConn
    .insertXbrl(rssItem=rssItem)\n’, ‘ File “C:\\Program Files\\Arelle\\plugin\\xbr
    lDB\\XbrlSemanticSqlDB.py”, line 145, in insertXbrl\n self.insertFiling(rssIt
    em)\n’, ‘ File “C:\\Program Files\\Arelle\\plugin\\xbrlDB\\XbrlSemanticSqlDB.py
    “, line 268, in insertFiling\n rssItemGet(“fileNumber”) or entityInfo.get(“fi
    le-number”) or str(int(time.time())),\n’] –

  5. Adam says:

    I set up a new postgres db using the schema provided.
    I’m having trouble with the Store to xbrl DB option.
    Any help would be appreciated.

    I get the following error:

    [xpDB:exception] Loading XBRL DB: AttributeError: ‘NoneType’ object has no attribute ‘get’
    [‘
    File “C:\\Program Files\\Arelle\\plugin\\xbrlDB\\__init__.py”, line 109, in backgroundStoreIntoDB\n
    product=product)\n’, ‘ File “C:\\Program Files\\Arelle\\plugin\\xbrlDB\\XbrlSemanticSqlDB.py”, line 68, in insertIntoDB\n
    xbrlDbConn.insertXbrl(rssItem=rssItem)\n’, ‘ File “C:\\Program Files\\Arelle\\plugin\\xbrlDB\\XbrlSemanticSqlDB.py”, line 145, in insertXbrl\n
    self.insertFiling(rssItem)\n’, ‘ File “C:\\Program Files\\Arelle\\plugin\\xbrlDB\\XbrlSemanticSqlDB.py”, line 268, in insertFiling\n
    rssItemGet(“fileNumber”) or entityInfo.get(“file-number”) or str(int(time.time())),\n’]

  6. Adam says:

    Thank you admin, that new build gets passed that error, but now it hits a different error:

    [xpDB:exception] Loading XBRL DB: ProgrammingError: (b’ERROR’, b’23502′, b’null value in column “accepted_timestamp” violates not-null constraint’)
    [‘ File “C:\\Program Files\\Arelle\\plugin\\xbrlDB\\__init__.py”, line 109, in backgroundStoreIntoDB\n product=product)\n’, ‘ File “C:\\Program Files\\Arelle\\plugin\\xbrlDB\\XbrlSemanticSqlDB.py”, line 68, in insertIntoDB\n xbrlDbConn.insertXbrl(rssItem=rssItem)\n’, ‘ File “C:\\Program Files\\Arelle\\plugin\\xbrlDB\\XbrlSemanticSqlDB.py”, line 145, in insertXbrl\n self.insertFiling(rssItem)\n’, ‘ File “C:\\Program Files\\Arelle\\plugin\\xbrlDB\\XbrlSemanticSqlDB.py”, line 327, in insertFiling\n returnExistenceStatus=True)\n’, ‘ File “C:\\Program Files\\Arelle\\plugin\\xbrlDB\\SqlDb.py”, line 849, in getTable\n result = self.execute(sqlStmt,commit=commit, close=False, fetch=fetch, params=params)\n’, ‘ File “C:\\Program Files\\Arelle\\plugin\\xbrlDB\\SqlDb.py”, line 316, in execute\n cursor.execute(sql)\n’, ‘ File “c:\\python33\\lib\\site-packages\\pg8000\\core.py”, line 487, in execute\n’, ‘ File “c:\\python33\\lib\\site-packages\\pg8000\\core.py”, line 1944, in execute\n’, ‘ File “c:\\python33\\lib\\site-packages\\pg8000\\core.py”, line 1457, in bind\n’, ‘ File “c:\\python33\\lib\\site-packages\\pg8000\\core.py”, line 1553, in handle_messages\n’, ‘ File “c:\\python33\\lib\\site-packages\\pg8000\\core.py”, line 1542, in handle_messages\n’, ‘ File “c:\\python33\\lib\\site-packages\\pg8000\\core.py”, line 1095, in handle_ERROR_RESPONSE\n’]

  7. Adam says:

    I am able to output facts, factTable and Concepts files, when run in GUI, commandLine or web service modes, hence I don’t think the problem is with my input xbrl file.

  8. Sjoerd van Steenkiste says:

    The xbrlSemanticMySqlDB.ddl contains multiple errors when trying to load it into a MySQL schema.

    Line 51: Primary Key (entity_id) is missing
    Line 364: PRIMARY KEY (entity_id) should be PRIMARY KEY (entity_identifier_id)
    Line 454: index02 should be index03

  9. Sjoerd van Steenkiste says:

    Furthermore I get an error when trying to load data into the database of the following type:

    Table ‘x’ was not locked with lock tables

    Any suggestions on resolving this issue?

    • admin says:

      you may also report to support@arelle.org or an issue via arelle.atlassian.net, please suggest which tables are ‘x’ above that also need to be locked for MySql (for the other database types only a few key tables are locked, the rest haven’t needed locking)

  10. Faran says:

    Is there any way to make join to role_type table to know what form type a particular document is for? I need to find the classification between cash flow, balance sheet, etc when I load a 10-K filing.

    Thanks.

    • admin says:

      probably need to join with data_points that match table_data_points with codes ‘BS’, ‘CF’, etc, and join on report_id too

  11. Kevin says:

    I have been trying to download the sec database via your link, but unable to do so as it shuts down after 200mb, is there any other links or torrent file? 14Gb is a big file but it shouldn’t stop after a few minutes. Any guidance much appreciated

    • admin says:

      sorry, no support of torrents. server is on GoDaddy, they should have good capacity. what geographic area are you downloading to (we may be able to find a more local server?)

  12. Gopi says:

    Hi admin, I am trying to load filing from Arelle 1.0.0 to SQL 2012 server which is created using the schema provided in this page. It throws the following error …
    [xpDB:exception] Loading XBRL DB: AttributeError: ‘XbrlSqlDatabaseConnection’ object has no attribute ‘entityId’
    [‘ File “C:\\Program Files\\Arelle\\plugin\\xbrlDB\\__init__.py”, line 109, in backgroundStoreIntoDB\n product=product)\n’, ‘ File “C:\\Program Files\\Arelle\\plugin\\xbrlDB\\XbrlSemanticSqlDB.py”, line 69, in insertIntoDB\n xbrlDbConn.insertXbrl(rssItem=rssItem)\n’, ‘ File “C:\\Program Files\\Arelle\\plugin\\xbrlDB\\XbrlSemanticSqlDB.py”, line 146, in insertXbrl\n self.insertFiling(rssItem)\n’, ‘ File “C:\\Program Files\\Arelle\\plugin\\xbrlDB\\XbrlSemanticSqlDB.py”, line 318, in insertFiling\n self.entityId,\n’]

    Do you have any insights about this error?

    Thanks.

    • admin says:

      Here’s a test build to try to fix this problem (requires reloading the sql schema definitions)

      http://arelle.org/files/arelle-win-x64-2014-08-22.exe

      • Jeremy Everett says:

        I am getting the same problem trying to load a filing to the XBRL DB on an Ubuntu 14.04 LTS server.

        [xpDB:exception] Loading XBRL DB: AttributeError: ‘XbrlSqlDatabaseConnection’ object has no attribute ‘entityId’
        [‘ File “C:\\Program Files\\Arelle\\plugin\\xbrlDB\\__init__.py”, line 109, in backgroundStoreIntoDB\n product=product)\n’, ‘ File “C:\\Program Files\\Arelle\\plugin\\xbrlDB\\XbrlSemanticSqlDB.py”, line 69, in insertIntoDB\n xbrlDbConn.insertXbrl(rssItem=rssItem)\n’, ‘ File “C:\\Program Files\\Arelle\\plugin\\xbrlDB\\XbrlSemanticSqlDB.py”, line 146, in insertXbrl\n self.insertFiling(rssItem)\n’, ‘ File “C:\\Program Files\\Arelle\\plugin\\xbrlDB\\XbrlSemanticSqlDB.py”, line 318, in insertFiling\n self.entityId,\n’]

  13. Srinivas says:

    I am getting the below error when I Tried to use “Store to XBRL DB”

    [xpDB:exception] Loading XBRL DB: ProgrammingError: (‘42000’, “[42000] [Microsoft][SQL Server Native Client 11.0][SQL Server]Unknown object type ‘TEMPORARY’ used in a CREATE, DROP, or ALTER statement. (343) (SQLExecDirectW)”)

    [‘ File “C:\\Program Files\\Arelle\\plugin\\xbrlDB\\__init__.py”, line 109, in backgroundStoreIntoDB\n product=product)\n’,
    ‘ File “C:\\Program Files\\Arelle\\plugin\\xbrlDB\\XbrlSemanticSqlDB.py”, line 69, in insertIntoDB\n xbrlDbConn.insertXbrl(rssItem=rssItem)\n’, ‘
    File “C:\\Program Files\\Arelle\\plugin\\xbrlDB\\XbrlSemanticSqlDB.py”, line 144, in insertXbrl\n self.dropTemporaryTable()\n’, ‘ File
    “C:\\Program Files\\Arelle\\plugin\\xbrlDB\\SqlDb.py”, line 286, in dropTemporaryTable\n close=True, commit=False, fetch=False, action=”dropping temporary table”)\n’,
    ‘ File “C:\\Program Files\\Arelle\\plugin\\xbrlDB\\SqlDb.py”, line 321, in execute\n cursor.execute(sql)\n’]

    Please help me with this?

  14. Chase says:

    When I try to load the DDL file into the MySQL schema, I get the following error. Any guidance?
    21:58:57 CREATE INDEX document_index02 USING btree ON document (document_url(512)) Error Code: 1071. Specified key was too long; max key length is 767 bytes 0.000 sec

  15. Hector says:

    the picture “http://arelle.org/wordpress/wp-content/uploads/2014/07/sql_diagram.png” incomplete, please can you post it again, Thanks

    • admin says:

      The picture has been refreshed. I hope it is better now. A few users of the database have made temporary tables (usually starting with “_”) or materialized view tables (some starting with “mv*”), usually to make some multi-step retrieval work better, those may be left in the distribution.

  16. Asker says:

    Is there a simple tutorial (how to set up mysql or postgres, first steps on connecting to them) – on how to get started with the locally installed ZIP SEC data? I am looking to set up a mac machine with the DB and arelle and use python to extract financial parameters. Example query “find me all companies that have market cap >$1B as of 2014”

    Thank you

  17. Mitch says:

    Couple Questions:
    1. Has anyone written any sql views to bring the schema together for income statement, balance sheet and cash flow?
    2. How do I automate the loading of new data into the schema as new reports are filed to the SEC.
    Thanks.

  18. Mitch says:

    A few other questions:

    1. Assuming there is a way to automate loading the SEC data… does it check to make sure the data wasn’t already loaded? or will it simply overwrite the existing values?

  19. john drinane says:

    Arelle Contributers,

    I have downloaded the example database (wget -O xbrldb_SEC.pg.gzip http://arelle.org/downloads/17) which has filings until August 2014. I believe this is an example of the so called “Abstract Model”.

    The model of the downloadable database is not explicitly stated during download, so is my understanding correct?

    Also if it is correct, according to the schema (http://arelle.org/wordpress/wp-content/uploads/2014/07/sql_diagram.png) for the Abstract Model it appears to be missing two tables:
    1. _filing_relationship_sets
    2. _relationship_sets_aspects

    and contains an extra empty table called staging_cik_coleft_c, which I assume is somehow utilized in a loading/updating process.

    Can you comment on the significance of the discrepancies or point to existing resources/documentation that could give further context on understanding the downloadable postgresql database.

    My application:
    I’m looking to perform research on the EDGAR data and your tool seems like a perfect tool to use so I’m trying to understand your process.

    Cheers!

    John Drinane

    • admin says:

      the diagram has been updated. the files starting in “_” are temporary files that some users have added to do multi-query work. I guess the file with “staging” is also from one of the users. Please excuse the extra files.

      We do have this file up to the end of the most recent file, send a private e-mail to support@arelle.org and we’ll give you a direct link.

      • john drinane says:

        Dear Admin,

        Thanks for the help prior. I am now trying to update the postgres XBRL database and was attempting to run the command line argument you list above, but it is not working at the moment for me and instead is issuing an exception of type below. Are you able to get this command below to work? I am mainly interested to know if I am calling the command inappropriately or if the specific RSS feed is the problem.

        THANKS!

        Example RSS Feed:https://www.sec.gov/Archives/edgar/monthly/xbrlrss-2015-09.xml

        Command:
        ./arelleCmdLine -f https://www.sec.gov/Archives/edgar/monthly/xbrlrss-2015-09.xml -v –disclosureSystem efm-pragmatic-all-years –store-to-XBRL-DB “127.0.0.1,port,pguser,pgpass,test_edgar,90,postgres”

        My exception:
        [info] loaded in 0.69 secs at 2015-09-20T21:58:04 – https://www.sec.gov/Archives/edgar/monthly/xbrlrss-2015-09.xml
        [info] RSS Feed – xbrlrss-2015-09.xml
        [info] RSS Item 0001357615-15-000202 10-K/A KBR, INC. 2014-12-31 – xbrlrss-2015-09.xml 12
        [exception] RSS item validation exception: Can’t convert ‘int’ object to str implicitly, instance: http://www.sec.gov/Archives/edgar/data/1357615/000135761515000202/0001357615-15-000202-xbrl.zip/kbr-20141231.xmlhttp://www.sec.gov/Archives/edgar/data/1357615/000135761515000202/0001357615-15-000202-xbrl.zip/kbr-20141231.xml , xbrlrss-2015-09.xml
        Traceback (most recent call last):
        File “/home/arelle/src/arelle/Validate.py”, line 126, in validateRssFeed
        File “/home/conan/Documents/EDGAR/ARELLE v 2014-12-31/plugin/xbrlDB/__init__.py”, line 211, in xbrlDBvalidateRssItem
        storeIntoDB(val.modelXbrl.xbrlDBconnection, modelXbrl, rssItem)
        File “/home/conan/Documents/EDGAR/ARELLE v 2014-12-31/plugin/xbrlDB/__init__.py”, line 166, in storeIntoDB
        result = insertIntoDB(modelXbrl, host=host, port=port, user=user, password=password, database=db, timeout=timeout, product=product, rssItem=rssItem, **kwargs)
        File “/home/conan/Documents/EDGAR/ARELLE v 2014-12-31/plugin/xbrlDB/XbrlPublicPostgresDB.py”, line 57, in insertIntoDB
        xpgdb = XbrlPostgresDatabaseConnection(modelXbrl, user, password, host, port, database, timeout, product)
        File “/home/conan/Documents/EDGAR/ARELLE v 2014-12-31/plugin/xbrlDB/SqlDb.py”, line 152, in __init__
        socket_timeout=timeout or 60)
        File “/usr/local/lib/python3.4/dist-packages/pg8000/__init__.py”, line 148, in connect
        File “/usr/local/lib/python3.4/dist-packages/pg8000/core.py”, line 1157, in __init__
        File “/usr/local/lib/python3.4/dist-packages/pg8000/core.py”, line 1152, in __init__
        File “/usr/local/lib/python3.4/dist-packages/pg8000/core.py”, line 1619, in handle_messages
        File “/usr/local/lib/python3.4/dist-packages/pg8000/core.py”, line 1365, in handle_AUTHENTICATION_REQUEST
        TypeError: Can’t convert ‘int’ object to str implicitly

  20. polyjian says:

    I am using the latest build on 2015/02/01, I am storing into MS sql server. I also have the problem which Jason reported above, basically, for a record in [data_point] table, the period_id column is NULL.

    Any issue I observed is that the end_date in the [period] is always T+1 of what reported in the xml raw file. Any reason for doing this?

    Any idea how to fix this? Unfortunately, I am not a python export and it is difficult for me to pinpoint the source code which caused this.

    • admin says:

      we’ll look into the empty period_id bug. The time is a date-time, so ending and instant time is the 24:00 hours of the day, which in SQL is midnight of the following day. This is for compatibility with the XBRL-US public database.

  21. polyjian says:

    I think I found the source of the bug, which is in the Sqldb.py getTable method, when a period is inserted with null start date, that row will NOT be returned (for MS Sql Server at least). Therefore, causing the period_id not found.

  22. polyjian says:

    continue above post, I don’t how to fix the bug though:-(

  23. polyjian says:

    I think I found the fix. In method SqlDb.py: getTable method, change
    “match”: ‘ AND ‘.join(‘{0}.{2} = #{1}.{2}’

    to
    “match”: ‘ AND ‘.join(‘({0}.{2} = #{1}.{2} or ({0}.{2} is null and #{1}.{2} is null))’

    • admin says:

      Thank you, please e-mail the updated module to support at arelle.org, I think that code construct is in two places in the MS sql code path.

  24. klaus says:

    When I run the command (under Mac 10.10.2): /Applications/Arelle.app/contents/MacOS/arelleCmdLine -f /Applications/MAMP/htdocs/bisbase.dk/tmp/adapt2013.xml –store-to-XBRL-DB “localhost,8889,root,root,xbrl_test,,mysqlSemantic” it fails with:

    (2006, “MySQL server has gone away (BrokenPipeError(32, ‘Broken pipe’))”)
    [‘ File “/Users/hermf/Documents/mvsl/projects/Arelle/ArelleProject/src/arelle/CntlrCmdLine.py”, line 858, in run\n’, ‘ File “/Applications/Arelle.app/Contents/MacOS/plugin/xbrlDB/__init__.py”, line 207, in xbrlDBCommandLineXbrlRun\n storeIntoDB(dbConnection, modelXbrl)\n’, ‘ File “/Applications/Arelle.app/Contents/MacOS/plugin/xbrlDB/__init__.py”, line 166, in storeIntoDB\n result = insertIntoDB(modelXbrl, host=host, port=port, user=user, password=password, database=db, timeout=timeout, product=product, rssItem=rssItem, **kwargs)\n’, ‘ File “/Applications/Arelle.app/Contents/MacOS/plugin/xbrlDB/XbrlSemanticSqlDB.py”, line 69, in insertIntoDB\n xbrlDbConn.insertXbrl(rssItem=rssItem)\n’, ‘ File “/Applications/Arelle.app/Contents/MacOS/plugin/xbrlDB/XbrlSemanticSqlDB.py”, line 160, in insertXbrl\n self.insertResources()\n’, ‘ File “/Applications/Arelle.app/Contents/MacOS/plugin/xbrlDB/XbrlSemanticSqlDB.py”, line 798, in insertResources\n checkIfExisting=True)\n’, ‘ File “/Applications/Arelle.app/Contents/MacOS/plugin/xbrlDB/SqlDb.py”, line 862, in getTable\n result = self.execute(sqlStmt,commit=commit, close=False, fetch=fetch, params=params)\n’, ‘ File “/Applications/Arelle.app/Contents/MacOS/plugin/xbrlDB/SqlDb.py”, line 321, in execute\n cursor.execute(sql)\n’, ‘ File “/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/PyMySQL-0.6.3-py3.4.egg/pymysql/cursors.py”, line 135, in execute\n’, ‘ File “/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/PyMySQL-0.6.3-py3.4.egg/pymysql/cursors.py”, line 274, in _query\n’, ‘ File “/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/PyMySQL-0.6.3-py3.4.egg/pymysql/connections.py”, line 713, in query\n’, ‘ File “/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/PyMySQL-0.6.3-py3.4.egg/pymysql/connections.py”, line 892, in _execute_command\n’, ‘ File “/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/PyMySQL-0.6.3-py3.4.egg/pymysql/connections.py”, line 852, in _write_bytes\n’] – /Applications/MAMP/htdocs/bisbase.dk/tmp/adapt2013.xml

    If I run the exact same command again right after, I get the message “[info] loaded in xx secs at …” and the data seems to be loaded.
    (the error can be re-produced by first running the xbrlSemanticMySqlDB.ddl and then the arelleCmdLine-command)

  25. Oscar R says:

    Por favor alguien en Español que me pueda colaborar. muchas gracias

    Mi nombre es Oscar, vivo en Colombia.
    En mi país una entidad de control requiere que le presente los informes en XBRL, ellos me entregaron la taxonomía
    https://www.superfinanciera.gov.co/xbrl/ifrs/2015-01-01/ctrl-103-agr-ind-int_entry-point_2015-01-01.xsd

    Requiero generar los informes, pero soy novato en el tema y requiero su colaboración.

    Tengo instalado el Arelle, puedo abrir la taxonomía.
    Requiero.
    Llenar los campos necesarios para el informe
    Generar los archivos XBRL.

    Por favor me puedes ayudar.

  26. Nate says:

    I’m trying to download the “Preloaded Database
    Here is a preloaded database of SEC filings, as a Postgres pg_dump gzip’ed file. ” . I am using Windows 7 with a git bash and wget.exe installed into the git bin. I downloaded on Sunday successfully, but failed to gunzip. At first, gunzip complained because gunzip expects “.gz” extension. I did a “mv” to rename the file from “…gzip” to “…gz” . (I may have screwed it up there)

    Then when trying to gunzip I got this message: gunzip: invalid compressed data–format violated

    Can I get a checksum to make sure I downloaded properly? Any suggestions?

  27. Tania says:

    I have a problem with Arelle: it don’t work with a big table. You can see it here.
    Is it problem with memory of computer or anything else? System: windows 7, 64 bit, 8Gb

  28. Sebastian says:

    Hi!

    First of all, I really appreciate this project and it helped me a lot! Thanks!

    Is this possible to allow Arelle to import several filings into database concurrently? (I’m using Postgres and xbrl_public schema.)

    I found two issues with my goal:
    1) Unique key violations – due to relatively complicated (and long) process of inserting whole filing, concurrent insertions gets the same values from sequence and thus unique key violations. However I solved it by modifying Arelle insertXbrl function – I used pg_advisory locks which turned insertion code into critical section with database lock.
    2) Additional file download problems – I didn’t dig too deep into this, but when several Arelle processes try to download extra files they fails with some renaming errors due to current file not found. This effectively messes cache completely and filing validation gives tons of errors. I suspect there is some fixed name for resource currently being downloaded and after successful download it gets renamed to it’s proper name in cache. When several processes attempt to download different files into single file – well it can’t succeed. Is it hard to fix this?

    • Sebastian says:

      Ok I think I found workaround for second problem. Instead of creating temp file names by adding ‘.tmp’, use really temporary file created by tempfile.mkstemp().

      It seems that it works concurrently now.

      However are there any other issues thaty I might now be aware of?

  29. Jeremy says:

    Could you also tell what the settings other than the IP address would be for the XBRL DB (Python) on a Linux OS?

  30. Jeremy Everett says:

    I am getting the same problem trying to load a filing to the XBRL DB on an Ubuntu 14.04 LTS server with the latest Arelle build on this website (downloaded Saturday, Jan 16, 2016).

    [xpDB:exception] Loading XBRL DB: AttributeError: ‘XbrlSqlDatabaseConnection’ object has no attribute ‘entityId’
    [‘ File “C:\\Program Files\\Arelle\\plugin\\xbrlDB\\__init__.py”, line 109, in backgroundStoreIntoDB\n product=product)\n’, ‘ File “C:\\Program Files\\Arelle\\plugin\\xbrlDB\\XbrlSemanticSqlDB.py”, line 69, in insertIntoDB\n xbrlDbConn.insertXbrl(rssItem=rssItem)\n’, ‘ File “C:\\Program Files\\Arelle\\plugin\\xbrlDB\\XbrlSemanticSqlDB.py”, line 146, in insertXbrl\n self.insertFiling(rssItem)\n’, ‘ File “C:\\Program Files\\Arelle\\plugin\\xbrlDB\\XbrlSemanticSqlDB.py”, line 318, in insertFiling\n self.entityId,\n’]

  31. Tania says:

    Hello!
    I always have a problem like this: http://clip2net.com/s/3tLQIGq
    What do I don’t correctly?)
    Thank you!

  32. Jason says:

    Is it possible to get a Microsoft SQL Server version of the XBRL-US Public Database? What needs to be done? Does the DDL just need to be converted from the Postgres version?

  33. AMH007 says:

    Fixed. Shouldn’t have used the ‘+’ sign.

  34. AMH007 says:

    Hi Guys,
    Did anyone manage to create views for financial statements? I know there are bits and pieces everywhere to get this info but I wasn’t able to build it

  35. Jeremy says:

    Is the database file we can download a plain text SQL dump from postgresql? If so, can you change your code to include what is necessary for it to restore?

    From stackoverflow.com: It looks like a (gzipped) plaintext -SQL dump. There is no create database thisdb; at the start, so you should do that yourself. (as user+owner postgres). Then, you should submit the big file via zcat the_file | psql -U postgres the_db_that_you_just_created

Leave a Reply

Your email address will not be published. Required fields are marked *