Data Mart

Data Mart provides a relational view of your book of business in Socotra. You can issue SQL queries across a broad set of tables, including policies, transactions, field values, and financials.

Getting Started with Socotra Data Mart

Prerequisites

  • Socotra-provided credentials and connection details

  • IP registration with Socotra to allow connections from your IP

  • A MySQL-compatible client, such as MySQL Workbench

Contact your Socotra representative for onboarding details.

Note

Popular analytical tools such as Tableau or Microsoft Power BI can also be used with their respective MySQL connectors.

Data Mart Schema

Data Mart schema

Data Mart Sample Reports

A few sample queries are included below.

Count of policyholders

SELECT COUNT(*) FROM policyholder;

Count of policies

SELECT COUNT(*) FROM policy;

Select all in-effect policies with monthly payment schedules

SELECT *
FROM policy
WHERE `issued_timestamp` IS NOT NULL
    AND `cancellation_timestamp` IS NULL
    AND `payment_schedule_name` LIKE 'monthly'
    AND `policy_start_timestamp` < NOW()
    AND `policy_end_timestamp` > NOW();

Select all policies issued with written premium within an interval

— Set Timestamps
SET @start_timestamp = 1654056000000; -- Saturday, January 1, 2022 12:00:00 AM EST
SET @end_timestamp = 1656648000000; -- Friday, July 1, 2022 12:00:00 AM EST
SET @as_of_timestamp = 1656648000000; -- Friday, July 1, 2022 12:00:00 AM EST


SELECT
pol.locator,
exp.name AS exp_name, exp.locator AS exp_locator,
per.name AS per_name, per.locator AS per_locator,
from_unixtime(MIN(per_c.start_timestamp/1000, '%Y-%m-%d')) AS effective_date,
from_unixtime(MAX(per_c.end_timestamp/1000, '%Y-%m-%d')) AS expiry_date,
SUM(per_c.premium) AS premium
FROM policy pol
JOIN exposure exp ON pol.locator = exp.policy_locator
JOIN peril per ON exp.locator = per.exposure_locator
JOIN peril_characteristics per_c ON per.locator = per_c.peril_locator
-- Policy is issued within the range provided
WHERE pol.issued_timestamp between @start_timestamp AND @end_timestamp
-- Keep all transactions created after the "as of" date before that date
AND per_c.created_timestamp <= @as_of_timestamp
-- Keep all transactions which have not been replaced before the "as of" date
AND (
    per_c.replaced_timestamp > @as_of_timestamp
    OR
    per_c.replaced_timestamp is null
)
GROUP BY exp_locator, exp_name, per_locator
ORDER BY pol.locator;

Summarize specific set of field values with corresponding characteristics and policy locators, across all policies

SELECT
pc.policy_locator,
pc.start_timestamp,
pc.end_timestamp,
pcf.policy_characteristics_locator,
pcf.parent_locator,
-- Basic template for each field: MAX for aggregation requirement and null filter
MAX(CASE WHEN pcf.field_name = "driver_firstname" THEN pcf.field_value END) "driver_firstname",
MAX(CASE WHEN pcf.field_name = "driver_lastname" THEN pcf.field_value END) "driver_lastname",
MAX(CASE WHEN pcf.field_name = "driver_designation" THEN pcf.field_value END) "driver_designation"
FROM policy_characteristics_fields pcf
JOIN policy_characteristics pc ON pcf.policy_characteristics_locator = pc.locator
WHERE parent_name = "drivers"
GROUP BY pcf.policy_characteristics_locator, pcf.parent_locator
ORDER BY pcf.policy_characteristics_locator, pcf.parent_locator ASC;
Summarize exposure field values, including group field values, for a particular policy
-- Note: this pattern applies for policy and perils as well.

SET @policy_locator = ‘locator-value’;

SELECT
ecf.field_name,
ecf.field_value,
ecf.parent_name,
e.name AS exp_name,
ec.locator AS exp_char_locator,
ec.start_timestamp AS exp_char_start,
ec.end_timestamp AS exp_char_end
FROM exposure_characteristics_fields ecf
JOIN exposure_characteristics ec ON ecf.exposure_characteristics_locator = ec.locator
JOIN exposure e ON ec.exposure_locator=e.locator
WHERE ec.policy_locator=@policy_locator
AND ecf.is_group=0;
Categorized count of all modifications on a policy
SET @policy_locator = ‘locator-value’;

SELECT pm.type as modification_type, COUNT(pm.type) as count
FROM policy_modification pm
WHERE pm.policy_locator = @policy_locator
GROUP BY pm.type;
Fetch grace periods for a given policy
SET @policy_locator = ‘locator-value’;

SELECT * FROM grace_period WHERE policy_locator = @policy_locator;

Summarize financial transactions on invoices for a policy

SET @policy_locator = ‘locator-value’;

SELECT tx.type, tx.amount,
tx.peril_characteristics_locator, tx.invoice_locator,
invoice.due_timestamp, invoice.settlement_status, invoice.settlement_type
FROM financial_transaction tx
JOIN invoice ON tx.invoice_locator = invoice.locator
WHERE tx.policy_locator = @policy_locator;

Data Mart Frequently Asked Questions (FAQ)

Can I connect with Tableau, Microsoft Power BI, or other such analytical tools?

Yes, you can connect with any MySQL/ODBC-compatible client. Both Tableau and Microsoft Power BI publish instructions for connections to MySQL:


How can I use datetime values in Data Mart?

Data Mart datetimes are stored as MySQL bigints, housing 13-digit Unix timestamps. You can use the standard MySQL date and time functions to work with these values; for example, to display selected policy issue dates in YYYY-MM-DD format, you can use from_unixtime:

SELECT from_unixtime(p.policy_start_timestamp / 1000, '%Y-%m-%d') FROM policy p;

Note

Depending on your use case, you may be able to rely on the MySQL implicit conversions between integral timestamps and the MySQL DATE, TIME, TIMESTAMP, and DATETIME types. Refer to the following MySQL documentation for details.


What is the relationship between the Data Mart schema and entity representations in the Socotra API?

While we aim for consistency in data representations, you should expect to see variations between these two systems that reflect different purposes. In comparing Data Mart’s policy and the API’s PolicyResponse object, for example, you can see that they expose many of the same attributes – locator, product name, timestamps, etc. – but that the naming conventions differ slightly. Two significant commonalities between the API and Data Mart include (1) consistent presence of locator identifiers for referential relationships; and (2) a relatively small set of data types for the full range of fields. Not all Data Mart tables will have a corresponding analogue in the API data model, and some tables will have fields not seen in a related API representation, often introduced to facilitate simplified SQL queries against the Data Mart schema.


How do group field entries differ from regular field entries?

Group fields, group field child entries, and non-group fields all coexist in a single policy_, exposure_, or peril_characteristics_fields table.

  • If an entry is a group field, the is_group value will be 1 (true) to indicate that it represents a top-level group field. The field_value will be a special locator value. This locator value will match the parent_locator for all child values for that entry.

  • If an entry is a child of a group field, the parent_locator will have a value that matches the field_value of the corresponding group field entry. The child’s is_group value will be 0 (false) since it is not itself a top-level group field value.

  • If an entry is neither a group field nor a child of a group field, parent_locator will be null, and is_group will be 0 (false).


How do field values relate to characteristics?

Consistent with the Socotra data model, a set of field values is associated with a specific characteristics object for the policy, exposure, or peril. The _fields tables provide a corresponding _characteristics_locator to draw a link to the corresponding characteristics, which can then be used to link back to higher-level entities (i.e. policy, exposure, peril, policyholder). Because a set of field values is linked to a specific set of characteristics, those values inherit the temporal properties of their corresponding characteristics. When querying against field values or characteristics, you should consider relevancy in a given time period, leveraging the characteristics’ start_, end_, issued_, and replaced_timestamps, as needed.


Why do some tables have an id field while others do not?

id fields help to uniquely identify rows in Data Mart in the absence of any other unique field, aiding the processes that insert and update Data Mart entries. They are distinct from locators in that they are (1) not intended to help link to other entities, and (2) not necessarily surfaced by or consistent with identifiers provided by the Socotra API. A locator value from Data Mart will be identical to the value obtained via API or exposed in the UI. Not all entities in Data Mart will have a locator.


What version of MySQL is Data Mart on?

Version 8.0.x; you can check the specific version at any time with the SELECT VERSION(); query.