SyncLite Data Consolidation Platform 

Welcome to SyncLite Documentation.

SyncLite is an open-source, low-code, comprehensive relational data consolidation platform, backed by patented technology, that transforms data-intensive application development. It helps developers rapidly create applications for desktops, edge devices, and smartphones with in-app data management and in-app analytics, and perform real-time data consolidation from numerous application instances into one or more databases, data warehouses, or data lakes of your choice, enabling ML and AI use cases on both edge and cloud.


SyncLite is scalable, secure, extensible, and fault-tolerant, and enables a wide range of use cases, including rapidly building smart resource monitors, native SQL data stores, and database migration/replication pipelines, deploying pluggable IoT data stacks, building data mesh architectures, enabling cloud databases at the edge, creating OLTP + OLAP solutions, and setting up software telemetry pipelines, among others.

At its core, the SyncLite data consolidation platform comprises the SyncLite logger, the SyncLite consolidator, and a configurable staging storage (backed by LocalFS/S3/SFTP/MinIO/Kafka/NFS/MS OneDrive/Google Drive etc.). SyncLite logger seamlessly integrates with applications across various devices, providing a SQL interface for transactional operations. It handles logging and shipping your SQL workload, directing it to a secure centralized staging area. No coding required.

The SyncLite consolidator, situated on a central host, efficiently merges the SQL workload received from countless devices into one or more centralized databases, data warehouses, or data lakes. This real-time consolidation enables real-time analytics and AI capabilities on the consolidated data.

SyncLite's scalability, extensibility, fault tolerance, and exactly-once semantics provide a foundation for a wide range of applications.

Prerequisites

Getting Started

QUICK START With Docker Container

SyncLite Platform is now open for every developer!  Kickstart your data consolidation journey with the free developer edition, offering unlimited data consolidation from up to 100 devices into PostgreSQL/DuckDB/SQLite destination databases. 

Add this Maven dependency to your Edge/Desktop Java applications and unlock the power of SyncLite Logger:


<dependency>
    <groupId>io.synclite</groupId>
    <artifactId>synclite-logger</artifactId>
    <version>2023.11.20</version>
</dependency>


For more details, dive into the GitHub Repo: syncliteio/SyncLiteLoggerJava: Repository to distribute SyncLite Logger for Java (github.com) 


Pull and start the Docker container(s) from the SyncLite Consolidator Docker Hub repository: syncliteio/synclite-consolidator - Docker Image | Docker Hub (refer to "Repository Overview" for more details) on your on-prem host or cloud VM. Configure and start the consolidator job through the web GUI.

Note: Please read SyncLite End User License Agreement and Privacy Notice carefully.

QUICK START With Release binary

INSTALLATION

Install JAVA (Windows)


cmd>java -version

openjdk version "11" 2018-09-25

OpenJDK Runtime Environment 18.9 (build 11+28)

OpenJDK 64-Bit Server VM 18.9 (build 11+28, mixed mode)


cmd>jps -l -m

31916 jdk.jcmd/sun.tools.jps.Jps -l -m

Install JAVA (UBUNTU)

sudo apt-get update

sudo apt-get install openjdk-11-jdk

java -version

jps -l -m

Install and Configure Apache Tomcat (Windows)


 <role rolename="manager-script"/>

 <role rolename="manager-jmx"/>

 <role rolename="manager-status"/>

 <role rolename="admin-gui"/>

<role rolename="admin-script"/>

 <user username="synclite_admin" password="<password>" roles="manager-gui,manager-script,manager-jmx,manager-status,admin-gui,admin-script"/>



<multipart-config>

  <!-- 50MB max -->

  <max-file-size>52428800</max-file-size>

  <max-request-size>52428800</max-request-size>

  <file-size-threshold>0</file-size-threshold>

</multipart-config>

Change the values of max-file-size and max-request-size to 500 MB, e.g.:


<multipart-config>

  <!-- 500MB max -->

  <max-file-size>500000000</max-file-size>

  <max-request-size>500000000</max-request-size>

  <file-size-threshold>0</file-size-threshold>

</multipart-config>


Install and Configure Apache Tomcat (Ubuntu)


#Systemd unit file for tomcat

[Unit]

Description=Apache Tomcat Web Application Container

After=syslog.target network.target


[Service]

Type=forking


Environment=CATALINA_PID=/opt/tomcat/temp/tomcat.pid

Environment=CATALINA_HOME=/opt/tomcat

Environment=CATALINA_BASE=/opt/tomcat

Environment='CATALINA_OPTS=-Xms512M -Xmx1024M -server -XX:+UseParallelGC'

Environment='JAVA_OPTS=-Djava.awt.headless=true -Djava.security.egd=file:/dev/./urandom'


ExecStart=/opt/tomcat/bin/startup.sh

ExecStop=/bin/kill -15 $MAINPID


User=<YourUserName>

Group=<YourGroupName>

UMask=0007

RestartSec=10

Restart=always


[Install]

WantedBy=multi-user.target


Make sure to specify your UserName and GroupName in the above file. 


 <role rolename="manager-script"/>

 <role rolename="manager-jmx"/>

 <role rolename="manager-status"/>

 <role rolename="admin-gui"/>

<role rolename="admin-script"/>

 <user username="synclite_admin" password="<password>" roles="manager-gui,manager-script,manager-jmx,manager-status,admin-gui,admin-script"/>



<multipart-config>

  <!-- 50MB max -->

  <max-file-size>52428800</max-file-size>

  <max-request-size>52428800</max-request-size>

  <file-size-threshold>0</file-size-threshold>

</multipart-config>


Change the values of max-file-size and max-request-size to 500 MB, e.g.:


<multipart-config>

  <!-- 500MB max -->

  <max-file-size>500000000</max-file-size>

  <max-request-size>500000000</max-request-size>

  <file-size-threshold>0</file-size-threshold>

</multipart-config>
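If you created or edited the unit file above (it is assumed to be saved under /etc/systemd/system/, e.g. as tomcat.service; adjust the path to your setup), reload systemd so that it picks up the new unit, then start and enable the service as shown below and verify it with systemctl status tomcat:

systemctl daemon-reload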


systemctl start tomcat 

systemctl enable tomcat 

Deploy SyncLite (Windows/Ubuntu)

DATA CONSOLIDATION FROM REMOTE APPS

You can deploy synclite-sample-app (or your own application that uses the SyncLite Logger driver) on multiple remote hosts and share the SyncLite device stage directories from these hosts with the consolidator host by configuring a device stage. SyncLite supports a variety of device stage types: Microsoft OneDrive, Google Drive, MinIO Object Storage Server, Apache Kafka, SFTP, directory sharing over a local network, NFS, etc.

Once the device stage is configured, SyncLite Platform can perform real-time data consolidation from numerous devices/databases created by these remote apps. The following sections briefly describe the steps to set up these device stages.

SFTP Sharing (Windows)


Subsystem  sftp   sftp-server.exe -d "E:\synclite_stage_user"

Match User synclite_stage_user

ChrootDirectory E:\synclite_stage_user

ForceCommand internal-sftp


PermitTunnel no

AllowAgentForwarding no

AllowTcpForwarding no

X11Forwarding no

AllowUsers synclite_stage_user

DenyUsers alice bob

(Note: Deny SFTP access for all other users.)

Save changes in this file and close the editor.
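After saving sshd_config, restart the OpenSSH Server service so that the new configuration takes effect. On Windows this can be done from an elevated PowerShell prompt (assuming the default service name sshd used by Windows OpenSSH Server):

Restart-Service sshd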

 

sftp synclite_stage_user@xyz

synclite_stage_user@xyz's password:

Connected to xyz.

sftp> pwd

Remote working directory: /

sftp> ls ../

Can't ls: "/../" not found

sftp> ls

devices

sftp> ls devices/

remote readdir("/devices/"): Bad message

sftp> mkdir devices/test_device

Couldn't create directory: Permission denied

sftp> put test.txt devices/test_device/test.txt

Uploading test.txt to /devices/test_device/test.txt

test.txt                                                                                                                                                                         100%    3     0.0KB/s   00:00

sftp>

If you get this result, then SFTP is set up for SyncLite usage.


destination-type = SFTP

local-stage-directory = <local_stage_directory_path>

sftp:user-name = synclite_stage_user

sftp:password = <password>

sftp:remote-stage-directory = /devices


Microsoft OneDrive Sharing (Windows)


destination-type = MS_ONEDRIVE

local-stage-directory = <Full path of <HostA>_synclite_devices folder>



Google Drive Sharing (Windows)


destination-type = GOOGLE_DRIVE

local-stage-directory = <Full path of <HostA>_synclite_devices folder>


MinIO Object Storage Server (Windows/Ubuntu)


{

    "Version": "2012-10-17",

    "Statement": [

        {

            "Effect": "Allow",

            "Action": [

                "s3:GetBucketLocation"

            ],

            "Resource": [

                "arn:aws:s3:::<YourBucketName>"

            ]

        },

        {

            "Effect": "Allow",

            "Action": [

                "s3:PutObject"

            ],

            "Resource": [

                "arn:aws:s3:::<YourBucketName>/synclite-*"

            ]

        }

    ]

}
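The above policy is meant to be attached to a dedicated application user (referred to below as SyncLiteAppUser). As a sketch, this can be done with the MinIO client (mc); the alias, user name, key values and policy file name below are placeholders, and the exact admin sub-commands may vary slightly across mc versions:

mc alias set myminio http://<MinIOHost>:9000 <admin-access-key> <admin-secret-key>

mc admin user add myminio SyncLiteAppUser <SyncLiteAppUserSecretKey>

mc admin policy create myminio SyncLiteAppUserPolicy synclite-app-user-policy.json

mc admin policy attach myminio SyncLiteAppUserPolicy --user SyncLiteAppUser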


{

    "Version": "2012-10-17",

    "Statement": [

        {

            "Effect": "Allow",

            "Action": [

                "s3:ListBucket"

            ],

            "Resource": [

                "arn:aws:s3:::<YourBucketName>"

            ]

        },

        {

            "Effect": "Allow",

            "Action": [

                "s3:GetBucketLocation"

            ],

            "Resource": [

                "arn:aws:s3:::<YourBucketName>"

            ]

        },

        {

            "Effect": "Allow",

            "Action": [

                "s3:GetObject"

            ],

            "Resource": [

                "arn:aws:s3:::<YourBucketName>/synclite-*"

            ]

        },

        {

            "Effect": "Allow",

            "Action": [

                "s3:DeleteObject"

            ],

            "Resource": [

                "arn:aws:s3:::<YourBucketName>/synclite-*"

            ]

        }

    ]

}


destination-type = MINIO

local-stage-directory = <local_stage_directory_path>

minio:endpoint = <MinIO endpoint URL>

minio:bucket-name = <MinIO bucket name>

minio:access_key = <MinIO access key for the SyncLiteAppUser>

minio:secret_key = <MinIO secret key for the SyncLiteAppUser>


Amazon S3 

{

 "Version": "2012-10-17",

 "Statement": [

   {

   "Effect": "Allow",

   "Action": ["s3:PutObject"],

   "Resource": ["arn:aws:s3:::<YourBucketName>/synclite-*"]

  }

 ]

}

Click Next: Tags, then Next: Review. On the Review policy page, specify the policy name as "SyncliteAppUserPolicy" and the description as "Upload only access for remote apps using SyncLite platform for data consolidation". Click Create Policy.
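If you prefer the AWS CLI over the console, an equivalent command is sketched below; the policy file name is a placeholder, and the resulting policy should be attached to the IAM user used by your remote applications:

aws iam create-policy --policy-name SyncliteAppUserPolicy --policy-document file://synclite-app-user-policy.json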


destination-type = S3

local-stage-directory = <local_stage_directory_path>

s3:endpoint = <S3 endpoint URL e.g. https://s3.ap-south-1.amazonaws.com>

s3:bucket-name = <S3 bucket name>

s3:access_key = <S3 access key for the synclite-app-user>

s3:secret_key = <S3 secret key for the synclite-app-user>


{

    "Version": "2012-10-17",

    "Statement": [

        {

            "Effect": "Allow",

            "Action": [

                "s3:ListBucket"

            ],

            "Resource": [

                "arn:aws:s3:::<YourBucketName>"

            ]

        },

        {

            "Effect": "Allow",

            "Action": [

                "s3:GetObject"

            ],

            "Resource": [

                "arn:aws:s3:::<YourBucketName>/synclite-*"

            ]

        },

        {

            "Effect": "Allow",

            "Action": [

                "s3:DeleteObject"

            ],

            "Resource": [

                "arn:aws:s3:::<YourBucketName>/synclite-*"

            ]

        }

    ]

}

Click Next: Tags, then Next: Review. On the Review policy page, specify the policy name as "SyncliteConsolidatorUserPolicy" and the description as "Object List Get Delete access for SyncLite consolidator app for performing data consolidation". Click Create Policy.

Apache Kafka (Windows/Ubuntu)


local-stage-directory = <local_stage_directory_path>

kafka:bootstrap.servers = <Comma separated list of servers>

#kafka:<any_other_documented_kafka_producer_property> = <kafka_producer_property_value>

#kafka:replication-factor = <Replication factor for kafka topics>


Refer to the Apache Kafka Producer Configuration documentation to set all the required producer config properties for your Kafka cluster.
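For illustration, a synclite_logger.conf pointing the device stage at a SASL-secured Kafka cluster might look like the sketch below; security.protocol and the sasl.* settings are standard Kafka producer properties passed through via the kafka: prefix, and all values are placeholders for your cluster:

local-stage-directory = <local_stage_directory_path>

destination-type = KAFKA

kafka:bootstrap.servers = broker1:9092,broker2:9092,broker3:9092

kafka:security.protocol = SASL_SSL

kafka:sasl.mechanism = PLAIN

kafka:sasl.jaas.config = org.apache.kafka.common.security.plain.PlainLoginModule required username="<user>" password="<password>";

kafka:replication-factor = 3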

Enabling Device Encryption

SyncLite provides an (optional) device encryption capability as a data security feature. When device encryption is enabled, every data/log/control file generated and shipped by the SyncLite driver (running as part of your remote application instances) is encrypted before being shipped to the device stage (which can be any of the supported technologies, e.g. SFTP/Amazon S3/Apache Kafka/MinIO/MS OneDrive/Google Drive/NFS etc.). SyncLite implements a robust encryption scheme that utilizes asymmetric encryption for added security. The SyncLite Consolidator decrypts every file after downloading it and before processing it. This security measure, over and above any other security measures configured for your device stage, further ensures that the data/logs generated by the applications cannot be compromised in transit and can only be processed by the SyncLite Consolidator.

You can enable device encryption by specifying the device encryption key file in the synclite_logger.conf configuration file as below:


device-encryption-key-file = <path/of/DER/format/public-key/file>


While configuring the SyncLiteConsolidator job through Web GUI, specify the path of the DER format private-key file for "Device Decryption Key File" as part of the "Configure Device Stage" step.

You can use following steps to generate a DER format key pair.


1. Generate key pair

ssh-keygen -t rsa


2. Convert private key format

ssh-keygen -p -m PKCS8 -f ~/.ssh/id_rsa


3. Write out public key in DER format

ssh-keygen -f id_rsa.pub -e -m PKCS8 | openssl pkey -pubin -outform DER > synclite_public_key.der


4. Write out private key in DER format

openssl pkcs8 -topk8 -inform PEM -outform DER -in id_rsa -out synclite_private_key.der -nocrypt


(Note: Ensure that the private key is kept secure and is made available only to the SyncLite Consolidator.)

Publishing SyncLite Job Statistics

SyncLite consolidator provides the ability to periodically publish statistics of the currently running job to Prometheus. You can configure this while configuring the SyncLite consolidator job at the “Configure Job Monitor” step.


Make sure to specify the job name in the prometheus.yml file as below and start the Prometheus push gateway.


scrape_configs:

  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.

  - job_name: "SyncLiteConsolidator"

Using SyncLite Logger

SyncLite Logger (along with SyncLite Consolidator) enables end-to-end, real-time CDC-based data replication/consolidation from thousands of embedded databases, including SQLite, DuckDB, Apache Derby and H2, embedded in edge/desktop applications, into a wide range of industry-leading databases, data warehouses and data lakes.

To utilize the SyncLite logger, follow these steps to add it as a dependency in your project. SyncLite logger is currently available as a JAR file and can be integrated into applications in various languages/platforms. Below are the instructions for different platforms:

Java:

<dependency>
    <groupId>io.synclite</groupId>
    <artifactId>synclite-logger</artifactId>
    <version>#USE LATEST VERSION AS AVAILABLE IN MAVEN#</version>
</dependency>

Python:

Integrating the SyncLite logger into your project is straightforward, allowing you to harness its capabilities in your preferred language or platform.

Supported Embedded Database Versions:

Following are the versions supported for respective embedded databases:

Supported SQL Syntax:

Using SyncLite DB

While SyncLite Logger is embeddable in Java and Python applications, SyncLite DB steps in as a sync-enabled standalone database server, offering seamless data replication/consolidation and real-time sync for applications built in any programming language. SyncLite DB wraps popular embedded databases like SQLite, DuckDB, Apache Derby, H2, and HyperSQL, and allows your applications to connect via JSON-based SQL requests over HTTP.

This makes SyncLite DB easy to integrate, offering real-time data sync for edge and desktop applications in any programming language.

Starting up SyncLiteDB:

synclite-db.bat --config synclite-db.conf (or synclite-db.sh --config synclite-db.conf on Linux)


This starts the SyncLite DB server listening at the specified address.

Request and Response Structure: JSON-Based Interaction:

SyncLite DB uses a simple JSON format for receiving SQL commands and sending responses over HTTP. This makes it easy to integrate SyncLite DB with any programming language or system. Below is a detailed explanation of how the request and response structure works:

Request JSON Example:

{

    "db-type": "SQLITE",                  // Specify database type: SQLITE, DUCKDB, DERBY, H2, HYPERSQL, SQLITE_APPENDER, DUCKDB_APPENDER, DERBY_APPENDER, H2_APPENDER, HYPERSQL_APPENDER, STREAMING

    "db-path": "C:\\synclite\\users\\bob\\synclite\\job1\\test.db",      // Database file path on the local file system.

    "synclite-logger-config": "C:\\synclite\\users\\bob\\synclite\\job1\\synclite_logger.conf",  // SyncLite Logger config file path

    "sql": "initialize",               // SQL command: initialize, close, begin, commit, rollback, <Any SQL statement as supported by the respective embedded database>

    "arguments": [[1, "one"], [2, "two"]],  // Optional arguments (a batch) to bind for a given prepared statement based sql statement as a JSON array, with each inner JSON array representing arguments to bind per record e.g. INSERT INTO t1 VALUES(? ,? ) 

    "txn-handle": "f47ac10b-58cc-4372-a567-0e02b2c3d479"  // Optional transaction handle if this sql operation is to be executed as part of transaction which has been started

}

JSON Response from SyncLiteDB:

{

 "result" : true // Operation result true/false  

 "message" : "Database initialized successfully"   // Server message for the executed operation 

 "txn-handle" : "f47ac10b-58cc-4372-a567-0e02b2c3d479"  // A unique transaction handle if the executed command was "begin". This handle must be passed for all subsequent sql operations to be excecuted as part of single transaction.

 "resultset" :  [{"a":1,"b":"one"},{"a":2,"b":"two"}]  // A JSON array, with each element being a JSON object containing <ColumnName, ColumnValue> pairs for sql queries returning resultset e.g. SELECT a, b FROM t1 

}
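As an illustration, the sketch below posts an initialize request to SyncLite DB from Java using the JDK's built-in HTTP client. The server address is an assumption (use the address configured for your SyncLite DB instance), and the request/response shapes follow the JSON structures described above:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SyncLiteDBRequestExample {
    public static void main(String[] args) throws Exception {
        // Assumed address; replace with the address/port configured for your SyncLite DB server.
        String serverUrl = "http://localhost:5555";

        // "initialize" request for a SQLITE device, matching the request JSON structure above.
        String initRequest =
              "{"
            + "\"db-type\": \"SQLITE\","
            + "\"db-path\": \"C:\\\\synclite\\\\users\\\\bob\\\\synclite\\\\job1\\\\test.db\","
            + "\"synclite-logger-config\": \"C:\\\\synclite\\\\users\\\\bob\\\\synclite\\\\job1\\\\synclite_logger.conf\","
            + "\"sql\": \"initialize\""
            + "}";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(serverUrl))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(initRequest))
                .build();

        // The response body is a JSON object carrying "result", "message" and, where
        // applicable, "txn-handle" and "resultset", as described above.
        HttpResponse<String> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}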

API Overview: Six APIs for Complete Data Management

SyncLite DB simplifies data management by providing six essential APIs for various programming languages (Java, Python, C++, C#, Go, Rust, Ruby, Node.js) that cover the entire lifecycle of managing SQL-based interactions in edge and desktop environments:

{     

  "db-type": "SQLITE",     

  "db-path": "C:\\synclite\\users\\bob\\synclite\\job1\\test.db",     

  "synclite-logger-config" : "C:\synclite\users\bob\synclite\job1\synclite_logger.conf"  

  "sql": "initialize" 

}

{

  "db-path": "C:\\synclite\\users\\bob\\synclite\\job1\\test.db",     

  "sql": "insert into t1 values(?, ?)",

  "arguments": "[[1, "one"], [2, "two"]]"

}

{  

  "db-path": "C:\\synclite\\users\\bob\\synclite\\job1\\test.db",     

  "sql": "begin" 

}

{

  "db-path": "C:\\synclite\\users\\bob\\synclite\\job1\\test.db",     

  "txn-handle": "f47ac10b-58cc-4372-a567-0e02b2c3d479",

  "sql": "commit" 

}

{

  "db-path": "C:\\synclite\\users\\bob\\synclite\\job1\\test.db", 

  "txn-handle": "f47ac10b-58cc-4372-a567-0e02b2c3d479",

  "sql": "rollback" 

}

{     

  "db-path": "C:\\synclite\\users\\bob\\synclite\\job1\\test.db",     

  "sql": "close" 

}

With these simple yet powerful APIs, developers can perform any data management task, ensuring SyncLite DB fits smoothly into any application architecture.

SyncLiteDB Code Samples:

Check out the SyncLiteDBClient code samples for popular programming languages including Java, Python, C++, Rust, Go, C#, Ruby, Node.js, and more:  SyncLite Code Samples.

Using SyncLite Client

SyncLite Logger Configurations

The SyncLite Logger offers a range of valuable configurations, many of which are optimized for performance with preset default values. These configurations provide an opportunity for further refinement at the application level, allowing you to tailor the logger to your specific needs. Detailed descriptions of these configurations are provided below for your reference. 


device-name = (Optionally) specify a device name that together with SyncLite generated UUID can uniquely identify the device.

log-queue-size (Default 2147483647) = (Optionally) specify log queue size in terms of number of log records. All logs are enqueued in the logger queue. The logger thread drains the queue and performs logging. Setting this option to a smaller value may reduce memory consumption but may slow down the application when the log queue is full.

log-segment-flush-batch-size (Default 1000000) = (Optionally) specify a batch size for intermediate log flush by the logger. This option may be relevant for performance tuning for very large transactions, where log records can be flushed by the logger intermittently.

log-segment-switch-log-count-threshold (Default : 1000000) = (Optionally) specify the log count threshold after which a log segment should be closed and a new one should be created. Note that log segments are only closed at the transaction commit boundary.

log-segment-switch-duration-threshold-ms (Default 5000) = (Optionally) specify the duration after which a log segment should be closed and a new one should be created. Note that log segments are only closed at the transaction commit boundary. Also note that whichever of the two thresholds (log-segment-switch-log-count-threshold or log-segment-switch-duration-threshold-ms) is reached first triggers the log segment switch, with the switch still happening at a transaction commit boundary.

log-segment-shipping-frequency-ms (Default 5000) = (Optionally) specify the interval at which the log shipper component looks for new log files in the device directory and attempts to ship them to the specified destination.

log-segment-page-size (Default 512) = (Optionally) specify the internal page size for log segments created by logger.

use-precreated-data-backup (Default false) = (Optionally) specify to force SyncLite logger to use a manually precreated data backup as the initial data copy and avoid the automatic internal data backup creation. Note that, if an older data backup is supplied then SyncLite logger may miss the transactions executed since that data backup till the point of starting the application. If you would like to use this option then you must keep the precreated backup in the same directory as that of the SQLite database file with the name of the backup file as <database_file_name>.synclite.backup

vacuum-data-backup (Default false) = (Optionally) specify to enforce vacuum operation on the backup database file. It might be helpful to reduce backup size with this option thereby reducing the amount of data shipped over the network to the remote location.

include-tables (Default NULL which means all tables are included by default) = (Optionally) specify a subset of tables as a comma separated list of table names, which need to be consolidated. Unspecified tables will not be included for data consolidation. If you are using this option then your application must ensure to not use SyncLiteConnection for any write operations on the excluded tables to avoid logging for those tables and instead should continue using SQLiteConnection.

exclude-tables (Default NULL which means no tables are excluded by default) = (Optionally) specify a subset of tables as a comma separated list of table names, which need to be excluded from consolidation. All tables except these specified tables will be included for consolidation. If you are using this option then your application must ensure to not use SyncLiteConnection, SyncLiteTelemetryConnection and SyncLiteAppenderConnection for any write operations on the excluded tables to avoid logging for those tables and should continue using SQLiteConnection (by using a SQLite connection string jdbc:sqlite instead of SyncLite).

device-encryption-key-file (Default NULL) = (Optionally) Specify the <path/of/DER/format/public-key/file> to use for encrypting data, log and control files generated by the logger before shipping to the stage.

disable-async-logging-for-transactional-device (Default false) = (Optionally) turn OFF asynchronous logging for transactional devices, which is the default logging method for transactional devices. Asynchronous logging internally uses a queue for staging SQL logs generated by user transactions and a separate thread to write the logs to the log file. Setting this option to true makes logging synchronous, performed as part of the same user transaction thread.

enable-async-logging-for-appender-device (Default false) = (Optionally) turn ON asynchronous logging for appender devices. The default logging method for appender devices is synchronous, which means logging is performed as part of the same user transaction thread; setting this option to true makes it use a queue-based asynchronous mechanism.

local-data-stage-directory (Default Same directory as that of SQLite database file) = (Optionally) specify a location where SyncLite should create the device directory locally.

local-command-stage-directory = Specify a location where SyncLite logger should download command files received from SyncLite consolidator. This must be specified when enable-command-handler is set to true.

destination-type (Default FS which means the device directory contents are not shipped anywhere) = Specify SFTP/MINIO/S3/KAFKA/MS_ONEDRIVE/GOOGLE_DRIVE to enable remote shipping of the device directory contents to the configured stage location.

#==============SFTP Configuration=======================================

sftp:host = Hostname/IP of the host running SFTP server

sftp:port = Port number of SFTP server (usually 22)

sftp:user = SFTP username 

sftp:password = SFTP user password

sftp:remote-data-stage-directory = Remote data stage directory to host device directories

sftp:remote-command-stage-directory = Relevant only when enable-command-handler is set to true. Remote command stage directory to host device command files generated at SyncLite Consolidator

#==============MinIO Configuration=======================================

minio:endpoint = MinIO endpoint to upload devices ( e.g. https://play.min.io:9000)

minio:data-stage-bucket-name = MinIO bucket name to stage SyncLite device directories

minio:command-stage-bucket-name = MinIO bucket name to stage device command files generated at SyncLite Consolidator

minio:access-key = MinIO endpoint access key. Make sure that this access key only has upload access configured

minio:secret-key = MinIO endpoint secret key


#==============S3 Configuration===========================================

s3:endpoint = S3 endpoint (https://s3-<region>.amazonaws.com) to upload devices

s3:data-stage-bucket-name = S3 bucket name to stage SyncLite device directories

s3:command-stage-bucket-name = S3 bucket name to stage device command files generated by SyncLite Consolidator

s3:access-key = S3 endpoint access key. Make sure that this access key only has upload access configured.

s3:secret-key = S3 endpoint secret key


#==============Kafka Configuration=======================================

kafka-producer:bootstrap.servers = bootstrap servers e.g. localhost:9092,localhost:9093,localhost:9094

kafka-producer:<any_other_kafka_producer_property> = <kafka_producer_property_value>

kafka-consumer:bootstrap.servers = bootstrap servers e.g. localhost:9092,localhost:9093,localhost:9094

kafka-consumer:<any_other_kafka_consumer_property> = <kafka_consumer_property_value> (Specify Kafka consumer properties when enable-command-handler is set to true since SyncLite logger needs to download command data from Kafka). 


#==============Command Handler Configuration==================

enable-command-handler=false|true (Specify if device command handler should be enabled)

command-handler-type=INTERNAL|EXTERNAL (Command handler can be an INTERNAL callback object in your application, which you create by implementing the SyncLiteCommandHandler interface, or it can be an EXTERNAL program/script which is invoked by SyncLite logger upon receiving a command.)

external-command-handler=synclite_command_processor.bat <COMMAND> <COMMAND_FILE> (The <COMMAND> and <COMMAND_FILE> placeholders must be kept as is; they are replaced with the actual command details when a command is received by SyncLite logger.)

(For Ubuntu, external-command-handler=synclite_command_processor.sh <COMMAND> <COMMAND_FILE>)

command-handler-frequency-ms=10000 (Frequency at which the command handler routine will check for new commands for processing).
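Putting a few of these options together, a minimal synclite_logger.conf for a device shipped to an SFTP stage might look like the sketch below; all values are placeholders and only options documented above are used:

local-data-stage-directory = /home/myuser/synclite/stage

destination-type = SFTP

sftp:host = <SFTP server host>

sftp:port = 22

sftp:user = synclite_stage_user

sftp:password = <password>

sftp:remote-data-stage-directory = /devices

log-segment-shipping-frequency-ms = 5000

device-name = machine-001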


The SyncLite API also provides the ability to set any of these options from within your application by using setter methods on an object of the SyncLiteOptions class.


SyncLiteOptions options = new SyncLiteOptions();

options.setDestinationType(1, DestinationType.FS);

options.setLocalStageDirectory(1, Path.of("E:\\database\\stageDir1"));

options.setDestinationType(2, DestinationType.FS);

options.setLocalDataDirectory(2, Path.of("E:\\database\\dataDir2"));

...

...

You can also load options from a configuration file as a starting point and set/replace additional options as per your needs.

SyncLiteOptions options = new SyncLiteOptions();

options.loadFromFile("E:\\db\\synclite_logger.conf");

options.setDeviceName(1, "Finance");

...


SyncLite logger provides the ability to publish a device to multiple stage systems. The options above, starting from local-stage-directory, are also available with a suffix -<stageIndex>. For example, you can specify that the device be published to two different stage systems, SFTP and MinIO, with the following config values:


local-stage-directory-1 = /home/myuser/stage1/

destination-type-1 = SFTP

sftp-1:host = 100.50.25.2

sftp-1:user = synclite

sftp-1:password = synclite

sftp-1:remote-stage-directory = /home/myuser/synclite/stageDir

local-stage-directory-2 = /home/myuser/stage2/

destination-type-2 = MINIO

minio-2:endpoint = http://100.50.25.2:9000

minio-2:access-key = synclite

minio-2:secret-key = synclite

minio-2:bucket-name = /synclite-devices


If you are using the Kafka Producer API, then below are a few more configurations you can add in the configuration file or set directly on the java.util.Properties object which is passed to the KafkaProducer object:

device-type= (Optionally) specify a device type (TELEMETRY/APPENDER) to be used for the SyncLite device which is created under the hood for the Kafka Producer. An APPENDER device type enables edge analytics on the event data being streamed, since the data is also persisted in the local database file which is created per topic and can be queried using the SQL API.

batch.size= (Optionally) set this Kafka producer configuration to define the upper limit for the batch size for accumulating multiple records together before writing them to the underlying SyncLite device. Default value is set to 1 MB.

linger.ms= (Optionally) set this Kafka producer configuration to define the upper limit of the time the Producer waits to accumulate multiple records together before writing them to the underlying SyncLite device. Default value is set to 5000.


SYNCLITE DEVICE TYPES AND THEIR USAGE

SyncLite allows applications to create three different types of devices. Depending on the use case, you need to choose an appropriate device type.

SyncLite Consolidator Configurations

Similarly, the SyncLite Consolidator accepts various configurations as input, all of which can be conveniently adjusted through the web GUI of the consolidator using the Configure Job wizard. This user-friendly interface streamlines the process of configuring your consolidator jobs to match your requirements. Detailed descriptions of these configurations are provided below for your reference. 

Configure SyncLite Consolidator Job

job-name (SyncLite Job Name) : Specify a job name. SyncLite creates a dedicated directory per job to maintain job data.

device-data-root (Work Directory) = The work directory where SyncLite consolidator maintains all the device directories which contain intermediate files, replicas, schema files, control files, trace files etc. 

src-app-type (Source Application Type) = SyncLite platform offers two important pre-built connector applications. SyncLite DBReader is the first one, which along with SyncLite consolidator allows setting up database migration/replication pipelines. SyncLite MQTTReader is the second one, which along with SyncLite consolidator allows setting up end-to-end pluggable data stacks for IoT applications which publish data to MQTT brokers. If the source application(s) synchronizing SyncLite devices with SyncLite consolidator are of type synclite-dbreader then select SYNCLITE-DBREADER. If the source application(s) are of type synclite-mqttreader then select SYNCLITE-MQTTREADER. If the source application(s) are your own custom applications built using synclite-logger then select CUSTOM_APPLICATIONS.

license-file (License File Path) = Path to a valid and untampered license file as received from SyncLite Support.

dst-sync-mode (Sync Mode) = Sync Mode can be CONSOLIDATION/REPLICATION. CONSOLIDATION sync mode attempts to merge the schemas and data received from all the devices into unified tables in the destination DB while REPLICATION sync mode replicates devices into separate and respective schemas/catalogs on the destination DB.

num-device-processors (Device Processors) = Number of worker threads to process devices in parallel. The default value is set to 2 * number of cores on the host. This configuration is crucial for controlling and fine-tuning scale-up of the consolidator job.

device-name-pattern (Allowed Device Name Pattern) = A (Java) regular expression pattern of device names to allow data consolidation of only selected devices.

device-id-pattern (Allowed Device ID Pattern) = A (Java) regular expression pattern of device ids to allow data consolidation of only selected devices.

enable-replicas-for-telemetry-devices (Enable Replicas for Telemetry Devices) = By default, replicas are disabled for telemetry devices. An edge application using a telemetry device only streams event logs without maintaining the copy of the data at the edge or on the SyncLite consolidator. This option allows maintaining a replica of entire data for that device on consolidator host (inside the device work directory), in addition to consolidation on destination DB. 

disable-replicas-for-appender-devices (Disable Replicas for Appender Devices) = By default, replicas are enabled for appender devices. An edge application using an appender device maintains a local copy of the ingested data at the edge and streams event logs, and a replica of the entire data of that device is maintained on the consolidator host (inside the device work directory), in addition to consolidation on the destination DB. This option allows disabling that replica on the consolidator host.

skip-bad-txn-files (Skip Missing/Corrupt Transaction Files) = By default, set to false. For transactional devices (DuckDB, Apache Derby, H2, HyperSQL) supporting concurrent writers, SyncLite logger generates a separate transaction log file for each transaction. All these transaction log files are referenced by the sqllog files, which maintain the transactional order of the individual transaction log files. If this option is set to true, then SyncLite Consolidator continues to process the next log record instead of failing when it finds that a certain transaction log file is missing or corrupt. Use this option with caution, as skipping transaction replay may result in correctness issues in the consolidated data.

failed-device-retry-interval-s (Failed Device Retry Interval (s)) = Minimum backoff interval in seconds after which data consolidation will be retried for devices which encountered repeated failures. Default value is 30 seconds. 

device-trace-level (Consolidator Trace Level) = Consolidator trace level. DEBUG level indicates exhaustive tracing, ERROR level indicates only error reporting and INFO level indicates tracing of important events including errors in both the synclite_consolidator.trace file and individual device trace files in device directories. Default value is INFO.

enable-request-processor (Enable Online Request Processor) = Specify if an online request processor should be enabled in the consolidator job. This request processor allows the running job to accept requests from the SyncLite consolidator dashboard without requiring the job to be stopped.

request-processor-port (Request Processor Port) = Specify the port number at which the request processor should listen for accepting online requests.

enable-device-command-handler (Enable device command handler) = Specify if device command handler functionality should be enabled. If this functionality is enabled, then the Manage Devices console provides an option to send device commands to specific devices.

JVM_ARGS (JVM Arguments) = Various JVM arguments which should be set for the consolidator job Java process, e.g. for setting the initial and max heap size to 8 GB, you can specify -Xms8g -Xmx8g. The default value is an empty string. (Note that this is an environment variable set by the consolidator before starting the job.)


Configure SyncLite Consolidator Job Monitor

update-statistics-interval-s (Statistics Update Interval (S)) = Interval in seconds at which SyncLite consolidator job should refresh statistics in the statistics db file. Default value is 1 second. 

enable-prometheus-statistics-publisher (Enable Prometheus Statistics Publisher) = Set to true if Prometheus statistics publisher should be enabled. Default value is false.

prometheus-push-gateway-url (Prometheus Push Gateway URL) = URL of the Prometheus push gateway where SyncLite consolidator job should be forwarding dashboard statistics periodically.

prometheus-statistics-publisher-interval-s (Prometheus Statistics Publisher Interval (S)) = Interval in seconds at which SyncLite consolidator job should publish statistics to the configured Prometheus endpoint. Default value is 60 seconds. 


Configure Device Stage

device-stage-type (Device Stage Type) = Device stage is a location/system holding device directories containing data and log files for devices uploaded by numerous edge applications. It can be one of these: FS, LOCAL_SFTP, REMOTE_SFTP, MS_ONEDRIVE, GOOGLE_DRIVE, LOCAL_MINIO, REMOTE_MINIO, KAFKA, S3.

device-upload-root (Device Stage Directory) = Relevant when stage type is set to FS/LOCAL_SFTP/MS_ONEDRIVE/GOOGLE_DRIVE/ LOCAL_MINIO.  A directory on local file system holding device directories containing data and log files for devices uploaded by numerous edge applications.

device-encryption-enabled (Device Encryption Enabled) = Must be set to true if device encryption has been enabled at the edge applications. Default value is false.

device-decryption-key-file (Device Decryption Key File) = Path of the private key DER file which should be used by the consolidator job to decrypt files received from remote devices.

device-scheduler-type (Device Scheduler Type) = Device scheduler type is a method in which devices are scheduled for consolidation by the consolidator job. It can be one of these: EVENT_BASED/POLLING/STATIC. Scheduler type EVENT_BASED is available for Local File System stage types having a local device stage directory (stage directory on the same host) where consolidator leverages File System events to identify arrival of new data/log files for individual devices. Scheduler type POLLING is available for all stage types where the consolidator worker threads continuously and periodically poll for new work (data/log files) in the device stage and process devices upon finding new work. Scheduler type STATIC allocates a single thread per configured destination and processes all the devices mapped for consolidation to that destination one after the other and repeats this process periodically. Default value is set depending on stage type.

device-polling-interval-ms (Device Polling Interval in ms) = When scheduler type is POLLING, this is the interval in milliseconds at which consolidator worker threads check for new work in each device (default value is 2000 ms). When scheduler type is EVENT_BASED, this is the interval in milliseconds at which consolidator worker threads check for new work in each device regardless of the file system event mechanism, to account for any missed file system events (default value is 30000 ms). When scheduler type is STATIC, this is the interval in milliseconds for which a worker thread backs off after processing all devices assigned to it; after this interval has elapsed, it processes all the devices again, and so on.

stage-oper-retry-count (Failed Operation Retry Count) = Number of retry attempts for each failed operation on device stage (operations such as list devices, download object, object exists, delete object etc.) . After exhausting all retry attempts, a device is marked as failed and retried after 'Failed Device Retry Interval'.

stage-oper-retry-interval-ms (Failed Operation Retry Interval (ms)) = Backoff interval in milliseconds after which a failed operation is retried on device stage. 

stage-sftp-host (Remote SFTP Server Host) = Relevant only when stage type is REMOTE_SFTP. Hostname/IP of the remote SFTP server.

stage-sftp-port (Remote SFTP Server Port) = Relevant only when stage type is REMOTE_SFTP. Remote SFTP server's port number (Default value is 22).

stage-sftp-user (Remote SFTP Server User) = Relevant only when stage type is REMOTE_SFTP. Remote SFTP server's username. Make sure this user has list, read and write access to the specified remote stage directory.

stage-sftp-password (Remote SFTP Server User Password) = Relevant only when stage type is REMOTE_SFTP. SFTP User password.

stage-sftp-data-directory (SFTP Data Stage Directory) = Relevant only when stage type is REMOTE_SFTP. SFTP directory holding device directories.

stage-sftp-command-directory (SFTP Command Stage Directory) = Relevant only when stage type is REMOTE_SFTP and enable-device-command-handler is set to true. SFTP directory to hold device command files generated at SyncLite Consolidator. These command files are downloaded and processed by respective devices.

stage-minio-endpoint (Minio Endpoint) = Relevant only when stage type is LOCAL_MINIO/REMOTE_MINIO.  MinIO endpoint e.g., https://play.min.io:9000 to connect to MinIO stage and look for new devices and device logs.

stage-minio-data-bucket-name (MinIO Bucket Name) = Relevant only when stage type is LOCAL_MINIO/REMOTE_MINIO.  MinIO data bucket name holding the device directories.

stage-minio-command-bucket-name (MinIO Command Bucket Name) = Relevant only when stage type is LOCAL_MINIO/REMOTE_MINIO and enable-device-command-handler is set to true.  MinIO command bucket name holding command files generated at SyncLite Consolidator. These command files are downloaded and processed by respective devices.

stage-minio-access-key (MinIO Access Key) = Relevant only when stage type is LOCAL_MINIO/REMOTE_MINIO. MinIO access key to access the specified stage-minio-bucket-name bucket. Make sure that this access key has all the permissions on the bucket.

stage-minio-secret-key (MinIO Secret Key) = Relevant only when stage type is LOCAL_MINIO/REMOTE_MINIO. MinIO secret key to access the specified stage-minio-bucket-name bucket.

stage-s3-endpoint (S3 Endpoint) = Relevant only when stage type is S3. S3 endpoint, e.g. https://s3.ap-south-1.amazonaws.com, to connect to the S3 stage and look for new devices and device logs.

stage-s3-data-bucket-name (S3 Bucket Name) = Relevant only when stage type is S3. S3 data bucket name holding device directories.

stage-s3-command-bucket-name (S3 Command Bucket Name) = Relevant only when stage type is S3 and enable-device-command-handler is set to true.  S3 command bucket name holding command files generated at SyncLite Consolidator. These command files are downloaded and processed by respective devices.

stage-s3-access-key (S3 Access Key) = Relevant only when stage type is S3. S3 access key to access the specified stage-s3-data-bucket-name bucket. Make sure that this access key has all the permissions on the bucket.

stage-s3-secret-key (S3 Secret Key) = Relevant only when stage type is S3. S3 secret key to access the specified stage-s3-data-bucket-name bucket.


Configure Destination Databases

num-destinations (Number of Destination Databases) = Number of destination databases for data consolidation.


Configure Destination Database

dst-type-<dstIndex> (Destination Type) = Destination database type for destination db number <dstIndex>. Supported types are SQLITE, DUCKDB, MYSQL, POSTGRESQL, MSSQL, SNOWFLAKE, DATALAKE.

dst-connection-string-<dstIndex> (JDBC Connection URL) = A complete JDBC connection URL to connect to destination db number <dstIndex>. Make sure the URL contains all the properties required for a successful database connection.

dst-user-<dstIndex> (User Name) = User name to access destination db number <dstIndex>. Make sure this user has privileges to perform DDL and DML (CREATE/DROP/ALTER/RENAME/INSERT/UPDATE/DELETE etc.) and SELECT operations on the specified destination db. Note that if you have included username and password in the dst-connection-string-<dstIndex> itself then this field can be left blank.

dst-password-<dstIndex> (Password) = User password to access destination db number <dstIndex>. Note that if you have included username and password in the dst-connection-string-<dstIndex> itself then this field can be left blank.

dst-database-<dstIndex> (Database Name) = Database/catalog name on destination DB number <dstIndex> which should hold the consolidated tables. Specify this field if the destination DB supports the notion of a database/catalog.

dst-schema-<dstIndex> (Schema Name) = Schema name on the destination DB number <dstIndex> which should be holding the consolidated tables. Specify this field if the destination DB supports the notion of schemas.

dst-alias-<dstIndex> (Destination DB Alias) = Alias for the destination db number <dstIndex>. This alias is used for conveniently identifying each destination database in the dashboard GUI.

dst-data-lake-data-format-<dstIndex> (Data Lake Data Format) = Relevant only when destination type is DATALAKE. Supported data lake data formats are SQLITE, DUCKDB, PARQUET, CSV, JSON. Note that when selected data format is PARQUET, CSV or JSON, an intermediate DUCKDB database file is created by consolidator for consolidating the data received from numerous devices and eventually exported into the desired format. 

dst-data-lake-object-switch-interval-<dstIndex> (Data Lake Object Switch Interval) = Relevant only when destination type is DATALAKE. Interval at which a new data lake object should be created to hold the consolidated data from all devices. Value 0 means a single data lake object is maintained. Default is 1 Day.

dst-data-lake-object-switch-interval-<dstIndex> (Data Lake Object Switch Interval Unit) = Relevant only when destination type is DATALAKE. Time unit for the data lake object switch interval; it can be MINUTES/HOURS/DAYS/MONTHS/YEARS.

dst-data-lake-local-storage-dir-<dstIndex> (Data Lake Local Storage Directory) = Relevant only when destination type is DATALAKE. Local storage path (on consolidator host) for storing data lake objects.

dst-data-lake-publishing-<dstIndex> (Publish Data Lake) = Relevant only when destination type is DATALAKE. Set to true if consolidated data lake object should be published to an external/cloud storage. Default value is false.

dst-data-lake-type-<dstIndex> (Publish Data Lake To) = Relevant only when destination type is DATALAKE and dst-data-lake-publishing-<dstIndex> is set to true. Supported storages are S3/MINIO. 

dst-data-lake-s3-endpoint-<dstIndex> (Data Lake S3 Endpoint) = Relevant only when destination type is DATALAKE, dst-data-lake-publishing-<dstIndex> is set to true and dst-data-lake-type-<dstIndex> is set to S3. S3 endpoint, e.g. https://s3.ap-south-1.amazonaws.com, to connect to S3 and upload finalized data lake objects.

dst-data-lake-s3-bucket-name-<dstIndex> (Data Lake S3 Bucket Name) = Relevant only when destination type is DATALAKE and dst-data-lake-publishing-<dstIndex> is set to true and dst-data-lake-type-<dstIndex> is set to S3.  S3 bucket name to hold data lake objects.

dst-data-lake-s3-access-key-<dstIndex> (Data Lake S3 Access Key) = Relevant only when destination type is DATALAKE, dst-data-lake-publishing-<dstIndex> is set to true and dst-data-lake-type-<dstIndex> is set to S3. S3 access key to upload data lake objects to dst-data-lake-s3-bucket-name-<dstIndex>. Make sure that this access key has upload permission to the specified bucket.

dst-data-lake-s3-secret-key-<dstIndex> (Data Lake S3 Secret Key) = Relevant only when destination type is DATALAKE and dst-data-lake-publishing-<dstIndex> is set to true and dst-data-lake-type-<dstIndex> is set to S3. S3 secret key to access the specified dst-data-lake-s3-bucket-name bucket.

dst-data-lake-minio-endpoint-<dstIndex> (Data Lake MinIO Endpoint) = Relevant only when destination type is DATALAKE, dst-data-lake-publishing-<dstIndex> is set to true and dst-data-lake-type-<dstIndex> is set to MINIO. MinIO endpoint, e.g. https://play.min.io:9000, to connect to the MinIO object server and upload finalized data lake objects.

dst-data-lake-minio-bucket-name-<dstIndex> (Data Lake MinIO Bucket Name) = Relevant only when destination type is DATALAKE and dst-data-lake-publishing-<dstIndex> is set to true and dst-data-lake-type-<dstIndex> is set to MINIO. MinIO bucket name to hold data lake objects.

dst-data-lake-minio-access-key-<dstIndex> (Data Lake MinIO Access Key) = Relevant only when destination type is DATALAKE and dst-data-lake-publishing-<dstIndex> is set to true and dst-data-lake-type-<dstIndex> is set to MINIO.  MinIO access key to upload data lake objects to dst-data-lake-minio-bucket-name-<dstIndex>. Make sure that this access key has upload permission to the specified bucket.

dst-data-lake-minio-secret-key-<dstIndex> (Data Lake MinIO Secret Key) = Relevant only when destination type is DATALAKE and dst-data-lake-publishing-<dstIndex> is set to true and dst-data-lake-type-<dstIndex> is set to MINIO. MinIO secret key to access the specified dst-data-lake-minio-bucket-name bucket.


Configure Data Types

dst-data-type-mapping-<dstIndex> (Data Type Mappings) = It can be one of the three values : ALL_TEXT/BEST_EFFORT/CUSTOMIZED. Select an appropriate data type mapping depending on your data consolidation use case. ALL_TEXT indicates that a column with any data type will be mapped to a variable length TEXT/VARCHAR (or an equivalent) data type on the destination type, this option is the most conservative. BEST_EFFORT indicates that the most appropriate matching data type will be identified for each source data type and used on the destination database. CUSTOMIZED allows you to override the system behavior and specify your own data type mappings in the below field. Default value is BEST_EFFORT when destination type is SQLITE else it is ALL_TEXT. 

dst-data-type-all-mappings-<dstIndex> (Source Type -> Destination Type) = All data type mappings from SyncLite to destination database. Individual data type mappings can be edited to come up with customized data type mappings. 

Configure Filter/Mapper

dst-enable-filter-mapper-rules-<dstIndex> (Enable Filter/Mapper Rules) = Set to true/false to specify if table/column filter/mapper rules are to be supplied for data consolidation/replication into specified destination.

dst-allow-unspecified-tables-<dstIndex> (Allow unspecified tables) = Relevant only when dst-enable-filter-mapper-rules-<dstIndex> is set to true. Set to false if you want to block tables for which no filter/mapper rules have been explicitly specified. Default is set to true to allow all unspecified tables. 

dst-allow-unspecified-columns-<dstIndex> (Allow unspecified columns) = Relevant only when dst-enable-filter-mapper-rules-<dstIndex> is set to true. Set to false if you want to block columns for which no filter/mapper rules have been explicitly specified. Default is set to true to allow all unspecified columns for each allowed table. 

dst-filter-mapper-rules-<dstIndex> (Filter/Mapper Rules) = Relevant only when dst-enable-filter-mapper-rules-<dstIndex> is set to true. Specify table/column filtering/mapping rules, one per line, in the following format:

tab1 = true

tab2 = false

tab3.col1 = true

tab3.col2 = true

tab3.col3 = false

tab4 = tab400

tab4.col1 = col401

tab4.col2 = false

In the above sample filter/mapper rules, it is specified to: allow table tab1, block table tab2, allow columns col1 and col2 of table tab3 while blocking its column col3, map table tab4 to table name tab400 on the destination, map column col1 of tab4 to column name col401, and block column col2 of tab4.


Configure Value Mapper

dst-enable-value-mapper-<dstIndex> (Enable Value Mapper) = Set to true/false to specify if incoming source values for individual table records need to be mapped to different values on destination during data consolidation/replication

dst-value-mappings-<dstIndex> (Value mappings) = Specify value mappings for source table columns in JSON format as below

{

  "tables": [

    {

      "src_table_name": "tabl",

      "columns": [

        {

          "src_column_name": "col1",

          "value_mappings": {

            "src_value_1": "dst_value_1",

            "src_value_2": "dst_value_2",

          }

        },

        {

          "src_column_name": "col2",

          "value_mappings": {

            "src_value_3": "dst_value_3",

            "src_value_4": "dst_value_4",

          }

        }

      ]

    },

    {

      "src_table_name": "tab2",

      "columns": [

        {

          "src_column_name": "col1",

          "value_mappings": {

            "src_value_a": "dst_value_a",

            "src_value_b": "dst_value_b",

          }

        }

      ]

    }

  ]

}

In the above value mappings, mappings are specified for columns col1 and col2 of table tab1 (mapping src_value_1/src_value_2 and src_value_3/src_value_4 to their respective destination values) and for column col1 of table tab2 (mapping src_value_a and src_value_b to their respective destination values).


Configure Destination DB Writer

dst-insert-batch-size-<dstIndex> (Insert Batch Size (Operations)) = Specify a batch size for INSERT operations that works the best for the specified destination DB and for your use-case/SQL workload. Default is set to an appropriate value for chosen destination DB.

dst-update-batch-size-<dstIndex> (Update Batch Size (Operations)) = Specify a batch size for UPDATE operations that works the best for the specified destination DB and for your use-case/SQL workload. Default is set to an appropriate value for chosen destination DB.

dst-delete-batch-size-<dstIndex> (Delete Batch Size (Operations)) = Specify a batch size for DELETE operations that works the best for the specified destination DB and for your use-case/SQL workload. Default is set to an appropriate value for chosen destination DB.

dst-txn-retry-count-<dstIndex> (Failed Transaction Retry Count) = Number of retry attempts for each failed transaction on the destination database. After exhausting retries, a device is marked as failed and retried again after 'Failed Device Retry Interval '. Default is set to an appropriate value for chosen destination DB.

dst-txn-retry-interval-ms-<dstIndex> (Failed Transaction Retry Interval (ms)) = Backoff interval in milliseconds after which a failed transaction is retried on destination database. Default is set to an appropriate value for chosen destination DB.

dst-idempotent-data-ingestion-<dstIndex> (Enable idempotent Data Ingestion) = Set to true if each incoming INSERT operation should be applied as a combination of DELETE followed by an INSERT (or UPSERT if available on the destination db) to ensure idempotency. This option may be very useful in failure situations or when data already exists in the destination DB. Please note that this option takes effect only if a given table has a primary key. Enable this option with caution as it may impact data ingestion performance. Default value is false.

dst-skip-failed-log-files-<dstIndex> (Skip Failed Log Files) = Set to true if you want the consolidator to skip failed log files after all retry attempts are exhausted and then move on to the next log file. Each skipped log file is preserved under <WorkDirectory>/<DeviceDirectory>/failed_logs/. Enable this option with caution as it impacts the correctness of the data consolidated on the destination database.

dst-device-schema-name-policy-<dstIndex> (Replicate Device Tables To Destination Schema with Name) =  Relevant only when dst-sync-mode is set to REPLICATION. When sync mode is set to REPLICATION then devices are replicated to individual schemas in destination DB. Value SYNCLITE_DEVICE_ID indicates that SyncLite device UUID is used as the schema name. Value SYNCLITE_DEVICE_NAME indicates that SyncLite device name is used as schema name. Value SYNCLITE_DEVICE_ID_AND_NAME indicates that a combination of both SyncLite device UUID and device name are used as schema name. Default value is SYNCLITE_DEVICE_ID_AND_NAME.

SyncLite Job Management

Within the SyncLite ecosystem, all three tools - Consolidator, DBReader, and QReader - offer robust capabilities for creating and managing jobs. Users can create multiple jobs to orchestrate various data pipelines efficiently. For instance, when setting up a data replication job from, say, FinDB to MktDB, users can designate a unique name such as "ETLFinDBToMktDB" in both DBReader and Consolidator.

SyncLite meticulously organizes all job-related data under the respective job directory, ensuring seamless management and retrieval. To streamline job monitoring and status tracking, SyncLite provides a dedicated tool called SyncLite JobMonitor. This tool enables users to conveniently view all jobs, monitor their statuses, and initiate job loading by simply clicking on the job name.

Furthermore, each of the three SyncLite tools - SyncLite DBReader, QReader, and Consolidator - features Reset Job functionality. This allows users to reset or delete job data as needed, with the option to preserve job configurations and metadata. This functionality empowers users to start afresh with their jobs whenever necessary, ensuring flexibility and ease of use.

The JobMonitor tool has a job scheduler functionality which allows users to schedule jobs to be started and stopped at a specified time of the day. The Configure Scheduler functionality allows users to create a job schedule for any job of any of the three types - DBREADER, QREADER, CONSOLIDATOR - by specifying the required schedule fields.

Please note that the scheduler must be started using the Configure and Start Scheduler button on the Configure Scheduler page, and its status must show STARTED on the Job Monitor dashboard for the scheduler functionality to work. In the event of a Tomcat restart or VM/machine restart, the user must open the Job Monitor tool, start the scheduler again, and make sure that the scheduler status shows as STARTED on the dashboard.

SyncLite Consolidator Device Management

The "Manage Devices" functionality within SyncLite Consolidator provides a range of operations for effective device management. These operations empower users to control data consolidation processes based on their specific needs:

The "Manage Devices" page facilitates the execution of these device operations, allowing users to specify devices in one of four ways:

This flexible approach ensures that users can target specific devices or groups of devices using a variety of identification methods, enhancing the adaptability and ease of use of the device management functionalities.


Disaster Recovery Considerations

The SyncLite Consolidator features a clear and distinct separation of computing and storage functionalities. The designated work directory (configured during job setup) houses all crucial data and metadata essential for the smooth functioning of SyncLite Consolidator. To ensure preparedness for potential disaster scenarios affecting the host where SyncLite Consolidator is operating, it is highly advisable to employ shared storage solutions like DAS, NAS, SAN, or cloud-backed storage. Such storage mechanisms safeguard data even in the event of host disasters, enabling the seamless resumption of data consolidation on an alternate host without any loss of information.

Furthermore, if you are utilizing a local SFTP server, a local MinIO server, or any other local storage as a staging solution for the device stage directory on the host, it is recommended to implement a similar shared storage approach for the work directory. This ensures consistent disaster recovery practices across different stages of the SyncLite Consolidator's operation.

SyncLite Database ETL/Replication Tool

The SyncLite Database Replication tool offers a flexible, scalable, schema-aware many-to-many database replication/migration engine. It enables you to effortlessly orchestrate incremental replication pipelines and manage data with precision, all with zero configuration changes required on the source database. The SyncLite DB Reader application is configured to extract data from the source DB into SyncLite telemetry devices, which are shared with the SyncLite Consolidator via a configurable staging storage; the consolidator then performs replication into one or more destination databases.

Key Features


System Requirements

Refer to the SyncLite DB Reader Configurations section for detailed documentation about the DB Reader tool and all the available configuration options.

Refer to the SyncLite Consolidator Configurations section for detailed documentation about the SyncLite Consolidator and all the available configuration options.

Refer to Real-Time Data Consolidation Platform - Database Replication (synclite.io) for more details about the SyncLite Database Replication/Migration/ETL tool.

Refer to Real-Time Data Consolidation Platform - Data Integrations (synclite.io) for more details about supported systems.

SyncLite DB Reader Configurations

The SyncLite DBReader, in conjunction with the SyncLite Consolidator and featuring a decoupled architecture, offers the flexibility to orchestrate highly scalable many-to-many database migration and incremental replication pipelines, including support for schema change replication, for a wide range of database systems.

SyncLite DBReader Configurations:

job-name (SyncLite Job Name): Specify a job name. SyncLite creates a dedicated directory per job to maintain the job data.

synclite-device-directory (SyncLite Device Directory): Specify SyncLite device directory, which is used by the DB reader to host the extracted data in SyncLite devices/databases. The default is set to <UserHome>/synclite/<jobName>/db.

src-type (Source DB Type): Select one of the database types from the dropdown.

src-connection-string (Source DB JDBC Connection String): Specify the complete JDBC connection string to connect to the source database.

src-database (Source DB Catalog/Database): Specify the source catalog/database name if the source DB supports the notion of catalog/database.

src-schema (Source DB Schema): Specify the source schema name if the source DB supports the notion of schema.

src-user (Source DB User): Specify the username to connect to the source DB. If the username is already included in the src-connection-string, then it need not be specified here. Make sure that the specified user has read permissions on all the tables and metadata objects under the specified catalog/schema.

src-password (Source DB Password): Specify the user password to connect to the source DB. If the username and password are already included in the src-connection-string, then they need not be specified here.

src-connection-timeout-s (Source DB Connection Timeout): Specify source DB connection timeout in seconds. The default is set to 30 seconds.

src-object-type (Source DB Object Type): Specify the object type as TABLE/VIEW/ALL to choose which database objects to read/replicate.

src-table-name-pattern (Source DB Table Name Pattern): Specify a table name pattern (as supported by SQL like predicate, e.g. use % to match 0 or more characters, use _ to match exactly one character) if you would like to fetch metadata for only a subset of tables from the source DB. The default is set to % to allow all tables.

src-object-metadata-read-method (Source DB Object Metadata Read Method): Specify the object metadata read method as NATIVE/JDBC. With the NATIVE method, SyncLite DBReader uses queries specific to the respective source to fetch object (table/view) metadata.

src-column-metadata-read-method (Source DB Column Metadata Read Method): Specify the column metadata read method as NATIVE/JDBC. With the NATIVE method, SyncLite DBReader uses queries specific to the respective source to fetch column metadata.

src-constraint-metadata-read-method (Source DB Constraint Metadata Read Method): Specify the constraint metadata read method as NATIVE/JDBC. With the NATIVE method, SyncLite DBReader uses queries specific to the respective source to fetch constraint metadata.

src-dbreader-processors (Source DB Reader Processors): Specify the number of processors/threads that should perform parallel data extraction from source DB tables. The default is set to the number of cores on the underlying host/VM.

src-dbreader-interval-s (Source DB Reader Interval): Specify interval in seconds at which the DB reader should attempt to extract incremental data from the source DB (for tables with incremental replication enabled).

src-dbreader-batch-size (Source DB Reader Batch Size): Specify the batch size as the number of records to be read from a source DB table and written into the respective SyncLite telemetry device at once. The default is set to 100,000.

dbreader-trace-level (DB Reader Trace Level): Specify the trace level as INFO/ERROR/DEBUG to emit various details about the DB Reader job in the trace file.

src-infer-schema-changes (Infer and Publish Schema Changes): Specify whether the DB Reader should read source DB object metadata during each incremental replication cycle and identify column additions/deletions/alterations and publish them for replication. The default value is set to false.

src-infer-object-drops (Infer and Publish Table/View Drops): Specify whether the DB Reader should read source DB object metadata during each incremental replication cycle, identify table/view drops, and publish them for replication. The default value is set to false.

src-reload-table-schemas (Reload Table Schemas): If you have made a schema change on the source database that SyncLite DB Reader cannot detect, resulting in a failure to perform data replication, then the recommendation is to stop SyncLite DB Reader and Consolidator, perform this schema change manually on the destination database, set this option to true in the DB reader configuration, and restart both the SyncLite DBReader and SyncLite Consolidator applications. This process makes SyncLite aware of the latest schema change applied and ensures smooth replication.

src-reload-object-schemas-on-next-job-restart (Reload All Table/View Schemas On Next Job Start): If you are not relying on SyncLite's schema inference and replication capabilities but are managing schema changes manually on the source and destination databases, the DBReader may fail to query data from changed objects in the source DB. In such situations, you can use this option to make SyncLite pull fresh schemas from the source DB for such objects and resume replication.

src-reload-object-schemas-on-each-job-restart (Reload All Table/View Schemas On Each Job Restart): Turn on this option if you need each job restart to pull fresh schemas from the source DB for all objects.

src-reload-objects-on-next-job-restart (Reload All Tables/Views on Next Job Start): Set this to true if you need all source DB objects to be extracted completely (full reload) all over again and published for replication on the next job start.

src-reload-objects-on-each-job-restart (Reload All Tables/Views on Each Job Restart): Set this to true if you need all source DB objects to be extracted completely (full reload) all over again and published for replication on each job start.

src-numeric-value-mask (Source DB Numeric Value Mask): Specify a digit between [0-9] to be used for masking digits in numeric values of masked columns. The Configure DB Objects page allows specifying mask columns for each table (sensitive columns such as CreditCardNumber, SSNNumber, etc. that need to be masked before replicating). The digit specified here is used as a replacement for each digit in numeric values of mask columns. The default is set to 9.

src-alphabetic-value-mask (Source DB Alphabetic Value Mask): Specify an alphabetic character [a-zA-Z] to mask characters in textual values of masked columns. Default is set to X.

src-quote-column-names (Quote Source DB Column Names): Set this to true if you need SyncLite to add quotations around column names while querying data from the source DB. Please note that if column names contain whitespace, SyncLite always adds quotations around such column names while querying the source DB.

src-dbreader-object-record-limit (Source DB Record Limit Per Table/View): Specify a record limit per table. If a non-zero value is specified, then reading is limited to only that many records. The default value is 0, meaning all records are considered for replication from each source table. This feature may be useful when a user needs to extract and replicate only a small sample of data from source DB tables, perform some analysis and decision making, and then run the full-fledged data extraction/replication.

synclite-logger-configuration-file (SyncLite Logger Configuration File Path): Set this to the path of the SyncLite Logger Configuration file. The default is set to <UserHome>/synclite/db/synclite_logger.conf. The GUI provides a configuration named "SyncLite Logger Configuration" with a textarea to specify all the SyncLite logger configurations as needed. Refer to the SyncLite Logger Configurations section for more details about the SyncLite Logger Configurations.

JVM_ARGS (JVM Arguments): Various JVM arguments that should be set for the DB reader job Java process (e.g., for setting the initial and max heap size to 8 GB, you can specify -Xms8g -Xmx8g). The default value is an empty string. (Note that this is an environment variable set by the DB reader before starting the job)
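
As an illustrative sketch, a DB reader job for an incremental replication pipeline from a hypothetical PostgreSQL source might be configured along these lines. All connection details and values are placeholders, and the exact option values accepted by the GUI dropdowns may differ:

job-name=ETLFinDBToMktDB
synclite-device-directory=/home/synclite/ETLFinDBToMktDB/db
src-type=PostgreSQL
src-connection-string=jdbc:postgresql://findb-host:5432/findb
src-database=findb
src-schema=public
src-user=synclite_reader
src-password=*****
src-dbreader-processors=8
src-dbreader-interval-s=60
src-dbreader-batch-size=100000
src-infer-schema-changes=true
src-infer-object-drops=false
dbreader-trace-level=INFO
JVM_ARGS=-Xms8g -Xmx8g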

The "Configure DB Tables" GUI screen fetches and shows all the source DB tables extracted from the source DB. Following are the various metadata details and required configurations for each table entry.

Table Name: Name of the source DB table.

Allowed Columns: A JSON array of columns allowed for replication for this source DB table. Each column is listed in the format: <ColName> <DataType> <NULL|NOT NULL>. If you want to block one or more specific columns from being published for replication, remove them from this list. Ensure that the column list remains a valid JSON array after any modifications.

Primary/Unique Key Columns: A comma-separated list of columns that are part of the primary key or a unique key (if no PK present) in that table. SyncLite strongly recommends having a PK/UK created on each table for replication to work correctly and efficiently. If a table does not have a PK/UK defined, then specify the set of columns maintained as unique by your applications (basically a logical unique key which is unique in that table though not defined explicitly) for that table.

Incremental Key Column: Specify a timestamp/datetime or a monotonically increasing numeric column present in the table to be used as the basis for incremental replication from this table. Typically, a column like last_update_time is present for auditing purposes and is updated by applications for every UPDATE/INSERT of a record in that table; such a column is the best candidate for the incremental key. SyncLite DBReader performs incremental data extraction from a table based on the value of this column. It maintains a bookmark of the last replicated value of the incremental key column for each table in each replication cycle and uses it as a lower watermark for the next replication cycle. Please make sure that all your applications update the value of this column for each INSERT/UPDATE operation; this is required for SyncLite DB Reader to correctly identify the records modified since the last replication cycle. If an incremental key is not specified for a table, then only a one-time replication is carried out for that table. SyncLite strongly recommends creating an index on this incremental key column on the source DB table so that the queries run by the DB reader on the source DB for data extraction perform well and do not slow down the source DB.
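
Conceptually, each incremental extraction cycle for such a table resembles the following query pattern. This is a simplified sketch using a hypothetical orders table; the actual SQL generated by the DB reader may differ:

-- Simplified sketch of one incremental extraction cycle (hypothetical table/column names).
-- :last_synced_value is the bookmark recorded for this table in the previous cycle.
SELECT * FROM orders
WHERE last_update_time > :last_synced_value;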

Mask Columns: Specify a comma-separated list of one or more columns containing sensitive data for which masking is required before initiating the replication process. Examples of sensitive columns include CreditCardNumber, SocialSecurityNumber, EmailAddress, PhoneNumber, etc. Ensure that the data in these columns is appropriately masked to safeguard sensitive information during the replication process.

Delete Condition: SyncLite leverages a soft delete mechanism for replicating DELETE operations if your applications implement it. For example, if your applications simply mark the record as deleted by setting a boolean/integer/character column, say is_deleted, to values like true/1/'Y' etc. (and do not actually delete the records), then specify this (equality) condition of deleted record here for that table e.g. is_deleted = true. SyncLite will leverage this condition to perform an actual delete operation on the destination database for such soft-deleted rows in the source DB.

Select Conditions: Specify SQL predicates on the current table using JSON array format to selectively extract data. When multiple entries are specified in this JSON array, parallel data extraction is implemented for each condition, optimizing the extraction process and enhancing performance.

Enable: You can use the Enable checkbox to enable/disable data extraction for certain tables.
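
Putting these fields together, a hypothetical entry on the Configure DB Tables screen for an orders table might look as follows (all names and values are illustrative):

Table Name: orders
Allowed Columns: ["order_id INTEGER NOT NULL", "region TEXT NULL", "card_number TEXT NULL", "amount REAL NULL", "is_deleted INTEGER NULL", "last_update_time TEXT NOT NULL"]
Primary/Unique Key Columns: order_id
Incremental Key Column: last_update_time
Mask Columns: card_number
Delete Condition: is_deleted = 1
Select Conditions: ["region = 'US'", "region = 'EU'"]
Enable: checked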

Delete-Sync Job:

If your applications do not implement a soft delete mechanism, SyncLite DB Reader has an additional mechanism, a delete-sync job. This job downloads Primary key/Unique key values from each table and pushes them to the consolidator through SyncLite devices. The SyncLite Consolidator then performs a deletion of all the records from each destination table that do not exist in the received PK/UK values for that table, thus performing a delete replication. This job is recommended to be run during maintenance hours, as this is a resource-intensive operation, requiring a download of PK/UK values for entire tables, with the consolidator processing all these records, loading them into a table (created temporarily), and performing a table-wide delete operation on each destination table. Note that the delete-sync functionality is available only for tables with PK/UK constraints defined.
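
Conceptually, for each destination table the delete-sync cycle amounts to an anti-join delete of the following shape. This is a simplified sketch with hypothetical table and key names; the actual statements issued by the consolidator may differ:

-- PK/UK values received from the source are first loaded into a temporary key table
-- (named synclite_tmp_orders_keys here purely for illustration), then destination rows
-- absent from that key set are deleted.
DELETE FROM orders
WHERE order_id NOT IN (SELECT order_id FROM synclite_tmp_orders_keys);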

SyncLite Rapid IOT Data Connector Tool

SyncLite's pluggable IoT data connector enables rapid development of IoT applications. Effortlessly read data at massive scales from MQTT brokers through your gateways and seamlessly consolidate it onto one or more databases, data warehouses, or data lakes of your choice. This allows you to focus on solving your core challenges without the hassle of intricate data management. 

The SyncLite MQTTReader, coupled with SyncLite Consolidator and featuring a decoupled architecture, empowers users to effortlessly orchestrate a comprehensive IoT data architecture within minutes. In this dynamic setup, a multitude of IoT devices transmits substantial data to their respective MQTT brokers deployed on gateways. Each MQTT broker, seamlessly integrated with SyncLite MQTTReader, efficiently reads and channels messages with remarkable throughput into SyncLite telemetry devices. These telemetry devices synchronize to a centralized SyncLite Consolidator, facilitating real-time data consolidation. The consolidated data can be seamlessly directed into one or more databases, data warehouses, data lakes, or timeseries databases, providing users with unparalleled flexibility in choosing their preferred data destination.

Key Features

Refer to the SyncLite MQTT Reader Configurations section for detailed documentation about the SyncLite MQTT Reader tool and all the available configuration options.

Refer to the SyncLite Consolidator Configurations section for detailed documentation about the SyncLite Consolidator and all the available configuration options.

SyncLite QReader Configurations

SyncLite MQTTReader Configurations:

Configure SyncLite MQTT Reader:

mqtt-broker-url (MQTT Broker URL): Specify the MQTT broker URL (e.g., tcp://localhost:1883).

mqtt-broker-user (MQTT Broker User): Specify the username (if configured) to connect to the MQTT broker.

mqtt-broker-password (MQTT Broker User Password): Specify the user password (if the user is configured) to connect to the MQTT broker.

mqtt-broker-connection-timeout-s (Broker Connection Timeout Interval (s)): Specify the MQTT broker connection timeout in seconds.

mqtt-broker-connection-retry-interval-s (Broker Connection Retry Interval(s)): Specify the MQTT broker connection retry interval in seconds.

mqtt-qos-level (MQTT Quality of Service Level): Specify the MQTT quality of service level. The default is set to At most once (0).

mqtt-clean-session (MQTT Clean Session): Specify if a clean MQTT session should be started on connection.

mqtt-message-format (Message Format): Specify the message format: CSV/JSON.

mqtt-message-header-delimiter (Message Header Delimiter): SyncLite MQTT reader expects each message header in the format <deviceName>/<topicName> by default. Use this configuration to change the delimiter in this format from the default / to any other character as needed.

mqtt-message-field-delimiter (Message Field Delimiter): Specify the message field delimiter. This is relevant when the message format is CSV. The default is set to a comma.

mqttreader-message-batch-processing (Message Batch Processing): Specify if messages should be buffered and published to SyncLite devices in batches. Enabling this option can substantially enhance message consumption rates during periods of high throughput. However, note that in-memory buffering and batching of messages may lead to message loss in the event of a tool restart. The default is set to true.

synclite-device-dir (SyncLite Device Directory): Specify the SyncLite device directory, which is used by the MQTT reader to host the extracted data in SyncLite devices/databases. The default is set to <UserHome>/synclite/db.

mqttreader-synclite-device-type (SyncLite Device Type): Specify the SyncLite device type to use. Specify APPENDER if you intend to query the locally created SyncLite device files (i.e., SQLite databases) containing all the incoming IoT data for in-app analytics on the MQTT Reader host itself. If you only intend to stream the data to the destination database, then specify TELEMETRY. The default is set to TELEMETRY.

mqttreader-map-devices-to-single-synclite-device (Map Devices To Single SyncLite Device): Specify if all incoming devices should be mapped to a single SyncLite device. The default is set to true. Setting this to false will create a SyncLite Device for each IoT device sending messages.

mqttreader-default-synclite-device-name (Default SyncLite Device Name): Specify the default SyncLite device name that should receive all incoming messages which have no device name specified in the header. Also, this default device will receive messages from all incoming devices when the Map Devices To Single SyncLite Device option is set to true.

mqttreader-ignore-messages-for-undefined-topics (Ignore Messages From Undefined Topics): Specify if messages received for undefined/unspecified topics should be ignored.

mqttreader-default-synclite-table-name (Default SyncLite Table Name): Specify the default SyncLite table name which should receive all incoming messages from topics that are not specified/defined by the user on the Configure MQTT Topics page.

mqttreader-ignore-corrupt-messages (Ignore Corrupt Messages): Specify if corrupt/unparsable messages for topics which are defined on the Configure MQTT Topics page should be ignored.

mqttreader-corrupt-messages-synclite-table-name (SyncLite Table Name For Corrupt Messages): Specify SyncLite table name that should receive all incoming messages from topics specified/defined by the user but having an unparsable payload or payload with a different number of fields than defined by the user on the Configure MQTT Topics page.

synclite-logger-configuration-file (SyncLite Logger Configuration File Path): Set this to the path of the SyncLite Logger Configuration file. The default is set to <UserHome>/synclite/db/synclite_logger.conf. The GUI provides a configuration named "SyncLite Logger Configuration" with a textarea to specify all the SyncLite logger configurations as needed. Refer to the SyncLite Logger Configurations section for more details about the SyncLite Logger Configurations.

synclite-logger-configuration (SyncLite Logger Configuration): Specify the SyncLite logger configuration. The specified device configurations are written into a .conf file and supplied during initialization of each database/device. Please note the defaults specified for local-stage-directory and destination-type.

mqttreader-trace-level (Job Trace Level): Specify the trace level as INFO/ERROR/DEBUG to emit various details about the MQTT Reader job in the trace file.

JVM_ARGS (JVM Arguments): Various JVM arguments that should be set for the MQTT reader job Java process (e.g., for setting the initial and max heap size to 8 GB, you can specify -Xms8g -Xmx8g). The default value is an empty string. (Note that this is an environment variable set by the MQTT reader before starting the job)

mqtt-num-topics (Number of Topics): Specify the number of topics to receive messages from.
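
For illustration, a hypothetical MQTT reader setup for a local broker publishing CSV messages could be specified along these lines. All values are placeholders, and the exact form in which these options are supplied is governed by the GUI:

mqtt-broker-url=tcp://localhost:1883
mqtt-broker-connection-timeout-s=30
mqtt-broker-connection-retry-interval-s=10
mqtt-qos-level=0
mqtt-clean-session=true
mqtt-message-format=CSV
mqtt-message-header-delimiter=/
mqtt-message-field-delimiter=,
mqttreader-message-batch-processing=true
mqttreader-synclite-device-type=TELEMETRY
mqttreader-map-devices-to-single-synclite-device=true
mqttreader-default-synclite-device-name=iot_gateway_1
mqtt-num-topics=2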

Configure MQTT Topics:

This GUI screen allows specifying configurations for each topic, including the topic name, the number of fields, the SyncLite table name to store data for this topic, and the create table SQL statement (in SQLite syntax) for this table.

Topic Name: Specify the name of the source topic.

Topic Table Name: Specify the name of the SyncLite table to store data for this topic.

Topic Field Count: Specify the field count in each message published in this topic.

Create Table SQL: Specify the create table SQL using SQLite syntax.

Enable: You can use the Enable checkbox to enable/disable receiving messages for specific topics.
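
As a hypothetical example, a topic carrying three-field temperature readings could be configured as follows (all names and values are illustrative):

Topic Name: temperature
Topic Table Name: temperature_readings
Topic Field Count: 3
Create Table SQL: CREATE TABLE temperature_readings(sensor_id TEXT, reading_ts TEXT, temperature REAL);

With the default <deviceName>/<topicName> header format described earlier, a matching message would then carry a header such as sensor_17/temperature and a CSV payload such as:

sensor_17,2024-05-01 10:15:00,23.7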