Excel for Data Analysis: The longer you use it, the more it frustrates.

Data Too Large to Handle

Excel has limits: it becomes very slow with hundreds of thousands of rows and cannot handle millions of rows.

Formulas Too Complex to Write

Excel is not good at handling complex operations like set operations and calculations after grouping. For example:

Find the periods of stock price increases lasting for more than 5 consecutive days.
Then, what to use?

Database?

While databases can handle large volumes of data, setting up the environment deters 90% of Excel guys. Additionally, the complexity of SQL often exceeds that of Excel.

Calculate the longest consecutive rising days for each stock

This nested code is a challenge even for professional programmers.

SELECT CODE, MAX(con_rise) AS longest_up_days
FROM (
	SELECT CODE, COUNT(*) AS con_rise
	FROM (
		SELECT CODE, DT, 
					SUM(updown_flag) OVER (PARTITION BY CODE ORDER BY CODE, DT) AS no_up_days
		FROM (
			SELECT CODE, DT, 
						CASE WHEN CL > LAG(CL) OVER (PARTITION BY CODE ORDER BY CODE, DT)  THEN 0
									ELSE 1 END AS updown_flag
			FROM stock
		)
	)
	GROUP BY CODE, no_up_days
)
GROUP BY CODE

Excel: An Intuitive Four-Step Solution

1. Sort by code and date.

2. Use formulas to calculate the consecutive rising days.

3. Group and summarize to calculate the longest consecutive rising days of each stock.

4. Collapse the display. DONE!

BI?

Though BI, leveraging databases, can handle large volumes of data, it also come with setup challenges.

BI lacks the advanced computational capabilities of SQL, leaving experienced Excel users feeling quite constrained

It appears the only viable solution is to learn programming. After all, programming can accomplish any task. However...

VBA

Poor support for tabular data, and even simple tasks require extensive code

It lacks built-in big data capabilities, and self-implementation is quite tedious.

Python

Decent capabilities for tabular data but still somewhat complex.

Calculate the longest consecutive rising days for each stock
import pandas as pd
stock_file = "StockRecords.txt"
stock_info = pd.read_csv(stock_file,sep="\t")
stock_info.sort_values(by=['CODE','DT'],inplace=True)
stock_group = stock_info.groupby(by='CODE')
stock_info['label'] = stock_info.groupby('CODE')['CL'].diff().fillna(0).le(0).astype(int).cumsum()
max_increase_days = {}
for code, group in stock_info.groupby('CODE'):
    max_increase_days[code] = group.groupby('label').size().max() – 1
max_rise_df = pd.DataFrame(list(max_increase_days.items()), columns=['CODE', 'max_increase_days'])

It also lacks built-in big data support, and you have to hard code manually

The interactivity of programming languages is also bad, nowhere near the user-friendly WYSWYG of Excel.

esProc Desktop: Four Key Advantages

1Powerful tabular data computation capability that far surpasses Python

Calculate the longest consecutive rising days for each stock
A
1=T("stock.xlsx")
2=A1.sort(DT)
3=A2.group(CODE;~.group@i(CL< CL[-1]).max(~.len()):max_increase_days)

Solving Excel puzzles along the way

Find the periods of stock price increases lasting for more than 5 consecutive days.
A
1=T("stock.xlsx")
2=A1.sort(CODE,DT)
3=A2.group@i(CODE!=CODE[-1]||CL< CL[-1])
4=A3.select(~.len()>=5)
5=A4.conj()

2Built-in Big Data Support

Cursor technology effortlessly handles big data that exceeds memory limits; simply adding options enables parallel computing to boost performance.

Calculate the longest consecutive rising days for each stock
A
1StockRecords.txt
2=file(A1).cursor@t().sortx(CODE,DT)
3=A2.group(CODE;~.group@i(CL< CL[-1]).max(~.len()):max_increase_days)
Find the periods of stock price increases lasting for more than 5 consecutive days.
A
1StockRecords.txt
2=file(A1).cursor@t().sortx(CODE,DT)
3=A2.group(CODE).conj(~.group@i(CL< CL[-1]))
4=A3.select(~.len()>=5).conj()

3Install and play, No Barriers

4Code Written in A Grid: Just Like Excel Formulas

Highly interactive, WYSWYG interface.

Find the periods of stock price increases lasting for more than 5 consecutive days.

Additionally, there is XLL to enhance Excel

To find periods of stock price increases for more than 5 consecutive days directly within Excel.
=spl("=E(?1).sort(CODE,DT).group@i(CODE!=CODE[-1]||CL< CL[-1]).select(~.len()>=5).conj()",A1:D253)