An extremely time-consuming process turns into a nasty marathon that leaves no chance to start over as the deadline looms.
It takes 10 minutes or longer, and extreme patience, to build a simple report. High concurrency or a large time span makes the query impossible.
Slow join operations and a sluggish drag-and-drop interface; the pre-aggregation module occupies too much space yet still has too many functional blind spots.
Expensive yet underperforming in-memory database
An optimal algorithm is useless before it is implemented
Highly performant structured data computations are impossible in SQL
A great algorithm that is unimplementable is useless
Java can achieve high performance algorithms, but the methods are too complicated to be feasible
SQL – Able to design but unable to express high performance algorithms
Java – Able to implement high performance algorithms, but only in overly complicated ways
In essence, high performance comes from high development efficiency
Able to design & Easy to write
1+2=3
3+3=6
6+4=10
10+5=15
15+6=21
…
1+100=101
2+99=101
…
There are fifty 101s
50*101=5050
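The contrast between the two calculations above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not esProc code; the function names are made up for this example.

```python
def sum_by_addition(n):
    """Add 1..n one term at a time, like the running totals above."""
    total = 0
    for k in range(1, n + 1):
        total += k
    return total


def sum_by_pairing(n):
    """Pair the terms (1 + n, 2 + (n - 1), ...); every pair sums to n + 1.

    There are n / 2 such pairs, so the total is n * (n + 1) / 2.
    """
    return n * (n + 1) // 2


# Both approaches agree, but the second needs one multiplication
# instead of n additions.
print(sum_by_addition(100), sum_by_pairing(100))
```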
Relational algebra-based SQL is like an arithmetic system that has only "addition", while esProc SPL invents "multiplication"!
esProc offers more types of multiplication (high performance computing & storage database) to enable everyone to become Gauss (to implement high performance algorithms quickly and efficiently)
Innovative algebraic system empowers esProc with high performance
The discrete dataset model is the "multiplication" esProc creates,
which makes high-performance algorithms easy to code and practical to implement
select max(continuousDays)-1
from (select count(*) continuousDays
  from (select sum(changeSign) over(order by tradeDate) unRiseDays
    from (select tradeDate,
        case when closePrice>lag(closePrice) over(order by tradeDate) then 0 else 1 end changeSign
      from stock) )
  group by unRiseDays)
A triple-layer nested query is needed even when SQL uses the window function;
But is it easy to understand?
 | A |
---|---|
1 | =stock.sort(tradeDate) |
2 | =0 |
3 | =A1.max(A2=if(closePrice>closePrice[-1],A2+1,0)) |
This follows the natural way of thinking: sort rows by trading date (row 1), compare each closing price with the previous one, add 1 to the counter if it is higher and reset the counter to 0 otherwise, and then take the largest counter value (row 3).
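For readers who do not know SPL, the same step-by-step logic can be sketched in Python. This is an illustrative translation, not esProc code. It assumes the closing prices are already sorted by trading date and counts only days that close higher than the previous day; the sample prices are invented.

```python
def longest_rising_streak(close_prices):
    """Return the longest run of consecutive days that closed higher
    than the day before. close_prices must be ordered by trading date."""
    longest = 0
    streak = 0
    for prev, cur in zip(close_prices, close_prices[1:]):
        # Extend the streak on a rise, otherwise reset it to 0.
        streak = streak + 1 if cur > prev else 0
        longest = max(longest, streak)
    return longest


# Hypothetical closing prices, ordered by date.
print(longest_rising_streak([10, 11, 12, 11, 12, 13, 14, 9]))
```

A single pass with one counter replaces the three nested subqueries.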
Lots of original algorithms!
 | A | |
---|---|---|
1 | =file("data.ctx").create().cursor() | |
2 | =A1.groups(;top(10,amount)) | Get orders whose amounts are in top 10 |
3 | =A1.groups(area;top(10,amount)) | Get orders whose amounts are in top 10 in each area |
Transform complex full sorting to simple aggregation
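The idea of treating top-N as an aggregation rather than a full sort can be sketched in Python with a bounded heap. This is an illustration of the general technique, not how esProc implements `top`. The field names `area` and `amount` come from the grid above; the sample orders and function names are invented.

```python
import heapq
from collections import defaultdict


def top_n(orders, n):
    """Top n orders by amount; keeps at most n candidates while scanning,
    so the full data set is never sorted."""
    return heapq.nlargest(n, orders, key=lambda o: o["amount"])


def top_n_by_area(orders, n):
    """Top n orders by amount within each area, in one pass (assumes n >= 1)."""
    heaps = defaultdict(list)  # area -> min-heap of (amount, seq, order)
    for seq, order in enumerate(orders):
        heap = heaps[order["area"]]
        item = (order["amount"], seq, order)  # seq breaks ties between equal amounts
        if len(heap) < n:
            heapq.heappush(heap, item)
        elif item[0] > heap[0][0]:
            heapq.heapreplace(heap, item)  # evict the smallest kept amount
    return {
        area: [o for _, _, o in sorted(heap, reverse=True)]
        for area, heap in heaps.items()
    }


orders = [
    {"area": "East", "amount": 120},
    {"area": "West", "amount": 300},
    {"area": "East", "amount": 80},
    {"area": "East", "amount": 250},
    {"area": "West", "amount": 50},
]
print(top_n(orders, 2))
print(top_n_by_area(orders, 2))
```

Each group holds only `n` candidates at any time, so memory use depends on the number of groups and `n`, not on the number of orders.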
 | A | |
---|---|---|
1 | =file("order.ctx").create().cursor() | Ready for traversal |
2 | =channel(A1).groups(product;count(1):N) | Configure a subordinate computation at traversal |
3 | =A1.groups(area;sum(amount):amount) | Traverse records to perform grouping and get the result |
4 | =A2.result() | Get result of the subordinate computation |
One traversal returns multiple result sets
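The benefit of attaching a second computation to the same traversal can be sketched in Python. This is an illustration of the idea, not of esProc's channel mechanism. The field names `product`, `area` and `amount` come from the grid above; the generator and sample records are invented.

```python
from collections import defaultdict


def traverse_once(orders):
    """Compute two independent groupings while reading the data only once."""
    count_by_product = defaultdict(int)
    amount_by_area = defaultdict(float)
    for order in orders:  # works with a one-shot iterator such as a file cursor
        count_by_product[order["product"]] += 1
        amount_by_area[order["area"]] += order["amount"]
    return dict(count_by_product), dict(amount_by_area)


def read_orders():
    """Hypothetical one-shot source standing in for a cursor over a large file."""
    yield {"product": "pen", "area": "East", "amount": 10.0}
    yield {"product": "ink", "area": "West", "amount": 25.0}
    yield {"product": "pen", "area": "East", "amount": 5.0}


counts, amounts = traverse_once(read_orders())
print(counts)
print(amounts)
```

When the data does not fit in memory, reading it twice doubles the I/O cost, which is why producing both results in one pass matters.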
Proprietary data storage format/bin file/composite file
Store data by categories in tree structured directory system
esProc offers two data storage designs – Redundancy-based plan for external data & "Spare tire" plan for memory data
The fault-tolerant technique for computing automatically reassigns the subtask(s) on a malfunctioning node to an available node for processing
The user-defined data distribution and redundancy plan tailored to suit the current data characteristics and computing situation considerably reduces the volume of data transmitted across nodes and thus increases performance
The centerless cluster system lets programmers manage computing nodes through code
The design decides whether to assign a subtask to a node according to its workload (number of threads on it), which ensures balance between workload and resources
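A minimal Python sketch of workload-based assignment and reassignment after a node failure is given below. It is not esProc's scheduler: the `Node` class, the "most free thread slots" policy and all names are assumptions made for illustration.

```python
class Node:
    """Hypothetical computing node with a fixed number of thread slots."""

    def __init__(self, name, max_threads):
        self.name = name
        self.max_threads = max_threads
        self.running = []  # subtasks currently assigned to this node
        self.alive = True


def pick_node(nodes):
    """Return the live node with the most free slots, or None if all are full."""
    candidates = [n for n in nodes if n.alive and len(n.running) < n.max_threads]
    if not candidates:
        return None
    return max(candidates, key=lambda n: n.max_threads - len(n.running))


def assign(nodes, subtask):
    """Assign a subtask by workload; return the node name or None if none is free."""
    node = pick_node(nodes)
    if node is None:
        return None
    node.running.append(subtask)
    return node.name


def fail_node(nodes, failed):
    """Mark a node as failed and reassign its subtasks to the remaining nodes."""
    failed.alive = False
    orphans, failed.running = failed.running, []
    return {task: assign(nodes, task) for task in orphans}


nodes = [Node("n1", 2), Node("n2", 3)]
first = assign(nodes, "a")            # n2 has more free slots
moved = fail_node(nodes, nodes[1])    # "a" moves to n1
print(first, moved)
```

A subtask that cannot be placed comes back as `None`; a real system would queue it and retry rather than drop it.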
【Feature】 Concurrency-intensive; potentially complex computing tasks; instant response (in seconds); big data cluster computing
【Feature】 No concurrency; weak demand for real-time response; step-by-step computing mode
【Feature】 No concurrency; no demand for real-time response; huge amount of data; tight time window
Absolutely yes! A high-efficiency storage plan is a guarantee of great performance. Neither RDB nor Hadoop can achieve high performance because of their traditional, inefficient storage designs.
esProc designs special, efficient data organization schemes for data stored in memory, in external storage and across a cluster, to suit a variety of computing scenarios.
esProc is based on a wholly original computing model with brand-new theory and syntax for which no open-source technologies can be borrowed.
Built on this innovative theory, esProc abandons SQL for high performance algorithm implementation, because SQL cannot describe most low-complexity algorithms.
It still offers a high-performance SQL interface for multidimensional analysis, where standard SQL is enough, so that it can work with various front-end BI tools.
esProc has exclusive SPL syntax to achieve performance optimization.
SPL is easy to learn; it only takes hours to learn it and weeks to master it!
The hard part is to design optimized algorithms!
We have designed the following optimization process to help users succeed.
We will designate a senior engineer to work with the user on their first one or two computing scenarios with esProc.
Some preliminary training and tuning are necessary, as most programmers are accustomed to SQL's roundabout way of thinking and are not familiar with high performance algorithms.
Users will then become skilled in applying dozens of performance optimization techniques to design and implement high performance algorithms.
Give users a solution and we support them for a day. Teach users how to reach a solution and they support themselves forever!