Tutorial 7- SQL commands for Advanced Beginners with Export to CSV and rendering to Excel

Offering 2 online sites for practicing SQL and Part 2 migrating our new data to CSV with Java or Python and displaying in Excel (first in a series)

We may want to refer back to tutorial #4 where I first introduce working with SQL

 

Many sql enthusiasts do not get beyond the basic commands like insert, update, delete and some very basic select statements. And this is okay for some folks. But at some point aren’t u curious about pursuing next steps with more advanced concepts. Wouldn’t u also like to combine SQL with some Java or Python? And we must not forget to also use SQL and Java with Excel too. Doing SQL just by itself can be a bit boring so I spice things up by integrating SQL into other formats which we will undoubtedly use at some point.

 

In this tutorial we want to spring beyond the very basic SQL commands to the advanced beginner level, then next up will be more advanced concepts in future tutorials.

 

Probably the biggest deterrent to advancing in SQL is having no data to play with or having to create some tables up front and hoping for the best. But how will beginners do this when they are barely off the starting blocks? Enter ready made online data, we can use it immediately with no lengthy setup.

 

Our agenda includes:

 

1. a live SQL site with sample tables where we can issue a command and get our result right away, kinda like sql plus in Oracle. But it is more than just a command prompt. It is online.

          Play around with some advanced beginner commands, Simple join with conditions

2. then i want to show u how to export this data to an outfile

                          Export to csv

3. with the outfile we can load our data into a java app

                         Can then display to html (tutorial 6) or render to excel  (this tutorial)

4. we can upload our data directly into a spreadsheet using java

                         Buffered reader reads the csv and then writes to excel

5. we can also read the csv outfile using Python

                        


 

So let’s give it a go. Keep this page open for instructions, open a second browser page and go to the following link at sqliteonline.com and jump right in. Keep the 2 pages side by side so u can see and follow the instructions while at the online site

 

https://sqliteonline.com/?fbclid=IwAR2tAtVmd6WKmGdq8NUV-o4993MLw94Fn0k3ztltGtZb4MEhBeq2bqSeQP0#fiddle-5c63c72ceadedhxjs2vjatx



 

This is what we see when we first enter the site, three tables already built with real data, what a bonus!

So look at the districts table and then the states table and first understand the data. There are 52 states and 437 districts. How do we know this? We do a basic count on the data

 

SELECT count(*) from congressional_districts;

 

Replace the command that is there now with this select statement and then press run.

 

Do the same count on the states and places table and after this then get familiar with what the data is all about just with a simple select * from congressional_districts command.

 

Before proceeding I want to show u how to get the DDL of this table (data definition language). Press export then sql schema and a file will be downloaded to your computer. Go to the downloads section and open it using notepad. So right click on the file and then open with notepad

Once opened u will see this following DDL which creates each of the three tables

 

CREATE TABLE congressional_districts (

   year INTEGER ,

   name TEXT ,

   geo_id TEXT ,

   total_population INTEGER ,

   white INTEGER ,

   black INTEGER ,

   hispanic INTEGER ,

   asian INTEGER ,

   american_indian INTEGER ,

   pacific_islander INTEGER ,

   other_race INTEGER ,

   median_age FLOAT ,

   total_households INTEGER ,

   owner_occupied_homes_median_value INTEGER ,

   per_capita_income INTEGER ,

   median_household_income INTEGER ,

   below_poverty_line INTEGER,

   foreign_born_population INTEGER,

   state TEXT,

   congressional_district TEXT

);

 

There are 20 columns for the congressional districts table

 

This is all part of getting to know our data from the outset. Start by studying what is in the 3 tables. Get an overall flavor of the data and how u might want to use it. I cannot stress this part enough. Also u kind of need to like the data because after all, data can be boring if u are not interested in it. Start thinking of some of the things we will do with the data. Don’t get overwhelmed just take it all in at first.

 

Now let’s find out how many congressional districts are in each state. Execute the following code:

 

SELECT states.name, count(congressional_districts.state) FROM congressional_districts,states where congressional_districts.state=states.state group by congressional_districts.state;

 

Congratulations, we have just graduated a bit beyond the beginner level. Here we are using a ‘group by clause’ and a simple join of 2 tables and we are also using the aggregate function called ‘count’.

 

Think of why we needed to group our data. With one table containing 437 records and the other only 52 records, there are multiple records for each state so we want our result to show only 52 results and  the aggregate count for each state, so group by is necessary to consolidate the result. Also note that we use dot notation when working with multiple tables in a join.

 

Now let’s pretty up our result a bit by using ‘as’ syntax to name our columns

 

SELECT states.name as state, count(congressional_districts.state) as districts_per_state FROM congressional_districts,states where congressional_districts.state=states.state group by congressional_districts.state;

 

Be careful when reading others syntax because it is perfectly legal to leave out the ‘as’ notation and u will get the same result. Try it out for yourself.

 

It is also legal to do this: we just abbreviated congressional_districts as c, pretty neat huh?

 

SELECT s.name state, count(c.state) districts_per_state FROM congressional_districts c,states s where c.state=s.state group by c.state;

 

As a beginner please be aware of these shortcuts and abbreviations. Also be aware that we are in SQLite and other SQL such as MySQL and MS SQL Server can use some slightly different syntax. This is all just a fact of life with SQL so take in the above and study it carefully. For the most part, basic SQL commands are pretty much universal across the different environments.

 

Now let’s attach a condition to our command and please self study beyond this tutorial if u want to get good at this.

 

Now how about only states with more than 5 districts? Try this


 

SELECT states.name as state, count(congressional_districts.congressional_district) as districts_per_state from congressional_districts, states where congressional_districts.state=states.state  group by states.name having districts_per_state >5;

 

We see that only 26 states fit this criteria. To achieve this result we attached a having clause after our group by clause.

 

So for now this is enough to push u over the beginner level. If u want another challenge with basic querying please see sqlzoo.net and click on each of the first 3 tutorials to test yourself in a quiz format. This really is a must to do even if u are not good at it because u will get to see many more commands than noted above.

 

https://sqlzoo.net/

 

Some of these queries are a bit tough but it will be a nice challenge. There are no answers provided on these quizz but when i got frustrated enough i just googled the questions and found answers out there. But I want u to really learn from this so give it a good try before googling the answers.



 

Part 2

 

Now for some other pretty cool stuff to try out. Let’s export our data into a csv file and then import this same csv into excel.

 

Execute whatever query u want first and then press the export button and then csv. Then go to the downloads area and open the file and u will see a csv file that has semicolon delimiters. Do a replace all on the semicolon and replace with commas and resave the file and u may also want to rename it to an easier name. Then move this file to the path we will use in this next example.

 

To import this csv file into excel first go back to tutorial 5 where we used the jxl library and use this same library below.

 

      This is the link to tutorial 5 for easy reference    tutorial 5

Create a new class and use the following code:

  And don’t forget about the correct path to the csv file and to the excel output file


 

import java.io.*;

import jxl.*;

import jxl.write.WritableWorkbook;

import jxl.write.WritableSheet;

import jxl.write.Label;

 

public class Read_txt_write_to_excel {

 

public static void main(String[] args) {

 

int total;  //total number of columns in csv file

String EXCEL_FILE_LOCATION = "C:\\users\\rickd\\desktop\\files\\MyFirstExcel.xls";

WritableWorkbook myFirstWbook = null;

Label label;

 

int col1=0;

int row=0;

String col=null;  //value of each column as a string

 

try {

   //this is where the excel output file is, see above path

myFirstWbook = Workbook.createWorkbook(new File(EXCEL_FILE_LOCATION));

 

           // create an Excel sheet

       WritableSheet excelSheet = myFirstWbook.createSheet("Sheet 1", 0);

 

String dataFileName = "c:/users/rickd/desktop/files/states.csv";

//this is the path and name for my csv input file, it could also be tab delimited

 

 

//read the csv input and count columns in first row

 

 BufferedReader in = new BufferedReader(new FileReader(dataFileName));

 String line;

       while ((line = in.readLine()) != null) {  //....this is the outer loop

      

     String[] columns = line.split(","); //create string array called columns

            //the split method parses input using comma delimiter

     total=columns.length; //get total number of elements in array = tot columns

           //for col count found above get parsed/split vals...this is the inner loop

     for (int i=0; i <total; i++){   //grab this val from above col count total

           //looping starts at position 0

         col = columns[i];  //gets col vals for each row, based on val of i          

            label = new Label(col1, row, col); //starts at col 0 row 0

                            // 0 , 0 , string val of col, first 0,0 then 1,0

             excelSheet.addCell(label); //add columns to sheet for this row pass

           

             col1=col1+1; //increment so next 'for' element goes to next column in current row

         

                     } //end inner for loop, all col vals stored in session

     

     col1 = 0;  //reset col position to 0 after each row pass

     row=row+1;  //go to next row in this outer loop - row 2 col 1 = position 0,1

            //cols and rows now held in memory

    } //end while loop for insert dialog, all rows stored in temp collection or result set

                 

             //write everything to sheet after both loops finish

      myFirstWbook.write();

          myFirstWbook.close(); //must have this each time here to write properly

          

          in.close(); //close input stream, release all cols and rows previously stored in memory

      

      //then go to path where excel file is and open it

          System.out.println("csv file is now imported into excel");

          

 } //end try

 

catch(Exception e){

System.out.println(" Exception : " + e.toString());

                 }

  

} //end main method

 

}  //end of class



 

I know we have covered a lot but u can also read a CSV using python and write it to a text file or u can just print the values or get creative and do something else with it

 

Here is the Python code:


 

import csv

 

 #Open file with "w"(write), the file will create automatically. This is where the text file is written

file = open("c:/users/rickd/desktop/files/testfile.txt", "w")

 

 #Open CSV file to read CSV, note: reading and write file should be under "with"

with open('C:/users/rickd/Desktop/files/states.csv') as csvFile:

 

       #Read CSV. we read the csv file and write it to variable called file

   readCsv = csv.reader(csvFile)

   for row in readCsv:

       #Get Values only in first 2 columns and manipulate in the file.write

       Id = row[0]

       Id1 = row[1]

 

       #Write CSV you need format it to string if the value is an int

       file.write(""+str(Id)+" "+str(Id1)+"\n")

 

 #You Must Close the FIle after writing it.

file.close()

  

#print(row)  prints all vals, if u don’t want to write the csv content to a text file

 

So this it it for now my friends.

 

Happy coding !

click here to exit this page