The most common data in business work is tabular data, such as order records, personnel information, sales contracts, etc., which is called structured data in professional terms. Excel is the most commonly used tool when dealing with tabular data. This is because excel provides easy-to-use formulas and user-friendly operations, and can complete most tasks. But sometimes it will be very tedious to do manual operations with excel when encountering some special and difficult operations, as well as repeated actions or the processing of large quantities of files. In this case, if you can program to solve the problems, you can…
For example,assuming you have the following two tables to be combined into one. Because the Order ID column is missing in the lookup table, the only way to match the orders is by Seller and Product:
If you are an experienced Excel Professional, you can rely on vlookup and index match formula. With the help of a large number of Excel practical experience, you know how to complete the work, but sometimes the task you encounter will still be very difficult. If you use esProc, it can be easily completed by simple coding with SQL.
The general process of processing…
For Example,there are two text files. Each row of both files is a string. We want to compare them row by row. To do this, we read in each row of every file as a string to form a set of strings and then perform a set operation.
paint.txt and dance.txt record IDs and names of children who enrolled in painting class and dancing class respectively. Below is part of paint.txt:
To find common records of the two files is to get their intersection.
Example: Find all children who enrolled in both painting class and dancing class…
To combine text files with same or similar structures into a single file are common for data analysts. Here we look at different types of scenarios . It’s convenient to do file combine with esProc.
A directory contains a number of text files of same column headers and structure but with different number of rows and detailed data. We want to merge these files into a single file under one set of column headers.
Example: there are same-structure text files that record daily orders under e:/orders. Each file has column headers on the first row and detailed data starting from…
Use csvsql or q, command-line tools that allow direct execution of SQL-like queries on CSVs/TSVs (and any other tabular text files ) if your query is very simple. If you want to do more complex queries or even want to query excel files directly, try esProc.
You can easily export the SQL query result to a new Excel file or text file.
Introduce an easy code to deal with the difficult problems of Excel grouping
Excel supports grouping & aggregation on data, as well as offering pivot tables to perform multidimensional analysis. But Excel methods and operations are too simple to handle complicated tasks or manage them conveniently. Such tasks include getting ranks on each subset, and filtering and sorting by aggregates. In this article, we’ll give examples of post-grouping calculations, analyze it and provide SPL code solution. SPL (Structured Process Language) is the language used by esProc, the specialized data computing engine. It’s convenient to use for handling structured computations.
The time series data refers to the business data ordered by time. The handling of it involves quarters, months, workdays and weekends, and sometimes, complicated order-based calculations. So you need to choose a scripting language that has enough computing ability. The commonly-used scripting languages include SQL, Python Pandas and esProc. Now let’s examine their capabilities one by one.
As the long-lasting and most widely-used language, SQL has been already at its peak within its model frame. Almost every simple operation has their SQL solution, including the order-based calculations of course.
The calculation of link relative ration is one example. The…
SQL (or the stored procedure) can handle most of the database computations. If the computations are complex or hard to deal in SQL, we use another programming language to read data out of the database to manipulate it. Such a programming language handles the data read and manipulation with a simple script. So we call the process the post-SQL scripting.
The scenarios that SQL is not good at handling include complex set-based operations, order-based operations, associative operations and multi-step computations, etc. Due to SQL’s incomplete set orientation and lack of explicit set data type, it’s almost impossible to reuse the…
A transposition is to rotate information from one row or column to another to change the data layout, for the purpose of making observations from a new perspective. Some transposition algorithms are simple, such as row to column, column to row and bidirectional transposition. Others are not so simple, such as dynamic transposition, transposition with inter-row calculations and join-based transposition. All are commonly seen in data analyses and thus worth a study.
Row-to-column transposition and column-to-row transposition are the simplest. Each is the other’s inverse computation.
What is a large file? A large file is a file that is too large to be read in at one time because of insufficient computer memory. In this case, direct use of desktop data tools (such as Excel) is powerless, often need to write a program to deal with it. Even if the program is written, a large file must be read in batches for calculation and processing. Finally, the batch processing results need to be properly summarized according to different calculation types, which is much more complicated than the processing of small file. There are many types of…