Core SAP HANA SQL script concepts- SELECT STATEMENT
Welcome to the next series of tutorials on SAP HANA SQL script which will aim at making you comfortable with the SQL side of SAP HANA. Most of us are scared of coding but let me assure you that you can pick it up fairly easily if you go through these tutorials and the exercises which will follow soon.
What data do we use?
In order to understand the concepts, I have created three tables as shown below in our 0TEACHMEHANA schema and also filled them up with 50 rows of data, each row corresponding to a unique Employee ID which is also the key field for all these tables. You can create these tables graphically, by SQL or HDB method and load data by flat file load, INSERT statements or CSV links to HDB tables .
Bottom line is that we have 3 tables each with some data. Let’s have a look at each.
Data disclaimer: The below data was generated by random data generation tools and any information displayed below has no association to reality..well.. except my name in the first row of course which is due to shameless self promotion.
The first table is EMP_MASTER. This is the major master data table containing information of Employee ID, Gender, Age, e-mail, Phone number, highest educational qualification, marital status and the number of children an employee has. This table has 50 records and the below data preview displays a section of this data.
The second table is EMP_NAMES. This is another Master data table that contains the first names and last names of the same 50 employees as in the EMP_MASTER table.
The third table is a transaction table containing the country in which the employee works and the salary he/she earns in USD.
I am attaching the data I used in all these tables if you want to use it to play around. Download it here.
Introduction to SAP HANA SQL script SELECT statement
Now, let’s get to business and start doing some analytics based on these data sets.
To start, we need to open up the SAP HANA SQL Editor like always using the SQL button below.
SAP HANA SQL SELECT Statement
In the editor, let’s write our first simple SAP HANA SQL Script code to view all the data from the EMP_MASTER table. The statement used for this is called SELECT and is the most important statement for any data analysis because it is used for reading data from tables. The * symbol after SELECT implies that you need all fields from this table. System statements, keywords and functions always appear in a reddish maroon colour as shown below whereas the table names and string values would usually be in blue. Integer values and arithmetic symbols usually appear black. As you see below, you should always provide the table name as . and not the alone.
Also, it’s best to terminate the SQL statement with a semicolon ; although you can skip it if you only have one statement. To execute the below, press the execute button in the SQL editor or F8 on your keyboard.
The result is shown in the ‘Result’ tab as captured below. Also seen below the data is a log that explains the number of records retrieved, time required to execute the statement and more.
As seen below, all 50 records were pulled from EMP_MASTER table with all fields displayed as expected with this execution.
SELECT TOP Clause
Now what if this is a fairly large table and I just wanted to see some of its data to understand what it holds. Pulling all of it would unnecessarily delay the output with a long job. But don’t worry. SQL has the TOP keyword just for this requirement. If you need only the top 10 values of a data set, write TOP 10 after the SELECT keyword as shown below. You can change the number as per your needs.
As captured below, only the top 10 records are shown.
Now SELECT * is a bad practice in coding since in real business tables you might have 200+ fields and millions of data records. You might only need a few for your analysis but you end up pulling all of them with the SELECT *. To solve this problem, always provide the field names you need after the SELECT keyword and separate them with commas. In the example below, I pick up only the fields EMP_ID, GENDER and EDUCATION from the EMP_MASTER table.
Also notice that all field names and table/schema names are enclosed in double quotes. It is a good practice to do so. I recommend doing the same.
Execute the statement and you will see that we get all the relevant data from EMP_MASTER for only the fields we requested.
Now what if we wanted to see what are the different values a field is containing in the table. For example for the Gender field, if I want to look at the values it holds in EMP_MASTER, I can use the SELECT statement as shown below.
On executing this, you can see the problem. I get 50 rows of data even though I only have distinct gender values.
SQL DISTINCT
To only get the distinct values of a column in any table, just add the keyword DISTINCT in front of the field name as shown below.
On execution, you would see that we now only get the two unique values for this field. This helps in some quick analysis sometimes.
Distinct can also be used to find unique combinations of two or more fields. For example, in the below statement, we try to find the distinct age-gender values in this pool of 50 employees.
Upon execution, we can see that there are 19 distinct combinations for these two fields. As mentioned, you can do this for any number of fields in the same way.
I hope this tutorial was helpful to you. Follow the link below to the next part where you will learn more details of the SAP HANA SQL script SELECT statement. Help this website grow by sharing this document on social media by using the icons below. Be sure to subscribe to our newsletter when the message pops up for latest alerts on new tutorials.
<<Previous Tutorial Next Tutorial>>