Scudata Cloud Datawarehouse

Cloud-native data warehouse supporting private deployment
High performance / Low code / Versatility / Openness

Composition of Scudata Cloud Datawarehouse

Scudata Cloud Datawarehouse (SCD) comprises several components and concepts, including QDB, QVM, QVA, QVS, and SPL.

  • QDB
    QDBase, the core of SCD, is responsible for data processing and for providing data services
  • QVM
    QDBase Virtual Machine, a computing resource of SCD. Each QVM is bound to a cloud virtual machine and is created or destroyed dynamically according to request demand. Each QVM comes with a QDB service program for processing computing tasks
  • QVA
    QDBase Virtual Allocator, the allocation system for QVMs. Each QVA manages a batch of virtual machines, responds to task requests, assigns virtual machines, and starts QVMs
  • QVS
    QDBase Virtual Service, a user-deployed service program that accepts and processes computing requests. QVS can run on users' own machines (which can be cloud-based virtual machines) in either embedded or server mode
  • SPL
    Structured Process Language, the formal language of QDB

Application Mode

Public cloud: AWS / GCP / Azure

Private deployment on public cloud

SCD characteristics

Separation of storage and computation

SCD stores data in files, which maps naturally onto cloud object storage: each file corresponds to exactly one object, and data is updated one whole file at a time.

Object Storage

Uses cloud object storage such as S3 directly, for lower cost

High Speed Cache

Each virtual machine (computing resource) caches hot data for high-performance computing

Separation of storage and computation

Storage and computation are completely separated. Storage can be selected and managed by users, and QDB is responsible for data caching and computation
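
The interplay of object storage and per-VM caching described above can be sketched as a read-through cache. This is only an illustration, not SCD's actual implementation: the class name `HotDataCache`, the `fetch` callable that stands in for an object-store read, and the least-recently-used eviction policy are all assumptions made for the example.

```python
from collections import OrderedDict


class HotDataCache:
    """Hypothetical read-through cache: one entry per object key, LRU eviction."""

    def __init__(self, fetch, capacity_bytes):
        self._fetch = fetch              # callable: object key -> bytes (stand-in for an object-store GET)
        self._capacity = capacity_bytes
        self._entries = OrderedDict()    # key -> bytes, least recently used first
        self._size = 0
        self.misses = 0

    def read(self, key):
        if key in self._entries:
            self._entries.move_to_end(key)   # mark as most recently used
            return self._entries[key]
        self.misses += 1
        data = self._fetch(key)              # cache miss: read the whole file from storage
        self._store(key, data)
        return data

    def invalidate(self, key):
        """Drop a cached file, e.g. after its object was replaced in storage."""
        data = self._entries.pop(key, None)
        if data is not None:
            self._size -= len(data)

    def _store(self, key, data):
        if len(data) > self._capacity:
            return                           # too large to cache; serve it uncached
        self._entries[key] = data
        self._size += len(data)
        while self._size > self._capacity:   # evict least recently used files
            _, evicted = self._entries.popitem(last=False)
            self._size -= len(evicted)
```

Because files and objects correspond one to one and updates replace whole files, invalidating a single key is enough to keep this kind of cache consistent after an update.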

Elastic expansion

  • Computing resources (QVMs) are created or destroyed dynamically according to demand, giving elastic expansion. Tasks from the same user are assigned preferentially to a QVM that user has already used (nearby scheduling)
  • QVMs are shared by all applications, scheduled uniformly, and billed according to usage
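
The scheduling behaviour in the two points above can be sketched as a toy allocator. It is a simplified illustration under stated assumptions, not the QVA's real logic: the names `QVM`, `Allocator`, `acquire`, `release`, and `scale_in`, the one-task-per-VM rule, and the `max_vms` limit are all hypothetical.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class QVM:
    """Stand-in for one compute VM; `busy` marks whether a task is running on it."""
    vm_id: int
    busy: bool = False


@dataclass
class Allocator:
    """Toy allocator: prefer the user's last QVM, else any idle one, else create one."""
    max_vms: int
    vms: dict = field(default_factory=dict)        # vm_id -> QVM (shared pool)
    last_used: dict = field(default_factory=dict)  # user -> vm_id
    _ids: count = field(default_factory=count)

    def acquire(self, user):
        # 1. Affinity: reuse the QVM this user ran on last, if it exists and is idle.
        vm = self.vms.get(self.last_used.get(user))
        # 2. Otherwise take any idle QVM from the shared pool.
        if vm is None or vm.busy:
            vm = next((v for v in self.vms.values() if not v.busy), None)
        # 3. Otherwise scale out, up to the configured limit.
        if vm is None:
            if len(self.vms) >= self.max_vms:
                raise RuntimeError("no capacity")
            vm = QVM(next(self._ids))
            self.vms[vm.vm_id] = vm
        vm.busy = True
        self.last_used[user] = vm.vm_id
        return vm

    def release(self, vm):
        vm.busy = False

    def scale_in(self):
        """Destroy all idle QVMs and return how many were removed."""
        idle = [vm_id for vm_id, vm in self.vms.items() if not vm.busy]
        for vm_id in idle:
            del self.vms[vm_id]
        return len(idle)
```

A caller would pair each `acquire(user)` with a `release(vm)` once the task finishes, and run `scale_in()` periodically to destroy idle machines.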

Lightweight Serverless

  • No schema and no metadata: data can be accessed directly for computation
  • There are no daemon sessions, no global variables, and no information shared between any two tasks except for the cache
  • No back-end state has to be kept for front-end users, so the service is naturally serverless

User Managed Data

  • Users provide their own cloud storage; SCD does not own user data and only reads and maintains it
  • The data is completely private and self-managed, and the user has complete control over it
  • Each user's data is isolated from every other user's, which naturally supports multi-tenancy
Low Coupling
Low cost

Private Debug

QVS supports private deployment

  • Remote debugging can be performed on user-prepared VMs and does not depend on SCD
  • Whether remote or local, private debugging does not consume SCD cloud resources, making it simple, convenient, and cost-effective


Diversified data source support

SPL supports unified access to multiple data sources and mixed computation across them

Real time data processing

Accessing and processing cloud data in real time keeps the data current and provides real-time data services for applications

Microservice implementation

With real-time data access and the real-time computing power of SPL, SCD can effectively support microservice implementations

High performance low code

High performance

SPL, which is based on a discrete dataset model, can run several times to hundreds of times faster than SQL; put another way, it achieves the same performance with fewer computing resources

Low code

SPL syntax supports procedural calculation and provides richer data types and operations, so complex calculations are simpler to write and need less code than in SQL

SCD: Reduces development and hardware costs N-fold


SPL is more comprehensive and simpler than SQL, Java, or Python, and can complete most data processing tasks on its own, which keeps the technology stack simple

Problems with SQL-based cloud data warehouses

  • SQL capabilities are incomplete, so other technologies are required
  • But it is difficult to make other technologies or UDFs available in the cloud
  • Users end up forced to write very complex SQL
  • That SQL is not only difficult to implement but also performs very poorly

SPL-based cloud data warehouse (SCD)

  • Complete SPL capabilities without the need for other technologies
  • SPL code is simple
  • Low-complexity algorithms deliver higher performance
  • Lower resource consumption (a single VM can handle the workload)
SCD: Comprehensive functionality, a simple technology stack, and N-fold lower development and operation costs

External Interfaces

  • Access is provided through two interfaces: SPL and HTTP
  • Applications interact directly with QVMs
  • Two-layer security control is implemented through execution permission control