SQL, which stands for Structured Query Language, is a language for interacting with data stored in something called a relational database.
You can think of a relational database as a collection of tables. A table is just a set of rows and columns which represents exactly one type of entity. For example, a table might represent employees in a company or purchases made, but not both.
Each row, or record, of a table contains information about a single entity. For example, in a table representing employees, each row represents a single person. Each column, or field, of a table contains a single attribute for all rows in the table.
Selecting columns. While SQL can be used to create and modify databases, the focus of this course will be querying databases. A query is a request for data from a database table (or combination of tables).
SELECT statement. For example, the following query selects the name column from the people table. In this query, SELECT and FROM are called keywords.SELECT name FROM people;SELECT name, birthdate FROM people;SELECT * FROM people;LIMITkeyword to limit the number of rows returned:SELECT * FROM people LIMIT 10;DISTINCT keyword.SELECT DISTINCT language FROM films;COUNT statement returns the number of rows in one or more columns.peopletable:SELECT COUNT(*) FROM people;COUNT on just that column.SELECT COUNT(col_name) FROM my_table;COUNT with DISTINCT to count the number of distinct values in a column.SELECT COUNT(DISTINCT birthdate) FROM people;Filtering results.
In SQL, the WHERE keyword allows you to filter based on both text and numeric values in a table. There are a few different comparison operators you can use:
= equal<> not equal< less than> greater than<= less than or equal to>= greater than or equal toImportant: in PostgreSQL (the version of SQL we're using), you must use single quotes with WHERE. Aggregate functions can't be used in WHERE clauses.
'Metropolis':(The WHERE clause always comes after the FROM statement).SELECT title FROM films WHERE title = 'Metropolis';SELECT * FROM films WHERE budget > 10000;WHERE queries by combining multiple conditions with the AND keyword.SELECT title FROM films WHERE release_year > 1994 AND release_year < 2000;AND condition.You can add as many AND conditions as you need!OR operator.When combining AND and OR, be sure to enclose the individual clauses in parentheses, like so:SELECT title FROM films WHERE (release_year = 1994 OR release_year = 1995) AND (certification = 'PG' OR certification = 'R');SELECT title FROM films WHERE release_year >= 1994 AND release_year <= 2000;BETWEEN keyword provides a useful shorthand for filtering values within a specified range. This query is equivalent to the one above:SELECT title FROM films WHERE release_year BETWEEN 1994 AND 2000;BETWEEN is inclusiveIN operator allows you to specify multiple values in a WHERE clause. If you want to filter based on many conditions, WHERE can get unwieldy. For example, using OR's, and then IN:SELECT name FROM kids WHERE age = 2 OR age = 4 OR age = 6 OR age = 8 OR age = 10;SELECT name FROM kids WHERE age IN (2, 4, 6, 8, 10);NULL represents a missing or unknown value. You can check for NULL values using the expression IS NULL.IS NULL is useful when combined with WHERE to figure out what data you're missing.SELECT COUNT(*) FROM people WHERE birthdate IS NULL;IS NOT NULL operator if you want to filter out missing values so you only get results which are not NULL.SELECT name FROM people WHERE birthdate IS NOT NULL;LIKE operator can be used in a WHERE clause to search for a pattern in a column. To accomplish this, you use something called a wildcard as a placeholder for some other values. There are two wildcards you can use with LIKE:'Data', 'DataC' 'DataCamp', 'DataMind', and so on:SELECT name FROM companies WHERE name LIKE 'Data%';'DataCamp', 'DataComp', and so on. The second query gets the names of people whose names have 'r' as the second letter. SELECT name FROM companies WHERE name LIKE 'DataC_mp';SELECT name FROM people WHERE name LIKE '_r%'
NOT LIKE operator to find records that don'tmatch the pattern you specify.Aggregate functions, Arithmetic and Aliasing.
Often, you will want to perform some calculation on the data in a database. SQL aggregate functions, to help you out with summarizing your data.
budget column of the films table:SELECT AVG(budget) FROM films;SELECT SUM(budget) FROM films WHERE release_year >= 2010;+, -, *, and /.AS keyword. For ex., the following gives you a result with the specified column names:
net_profit.avg_duration_hours.people who are no longer alive. Alias the result as percentage_dead.Sorting, Grouping and Joins
ORDER BY keyword is used to sort results in ascending or descending order according to the values of one or more columns.By default ORDER BY will sort in ascending order. If you want to sort the results in descending order, you can use the DESC keyword.
The following, gives you the titles of films sorted by release year, from newest to oldest.
SELECT title FROM films ORDER BY release_year DESC;Remark. Text values are ordered alphabetically.
ORDER BY can also be used to sort on multiple columns. It will sort by the first column specified, then sort by the next, then the next, and so on. For example, the following sorts on birth dates first (oldest to newest) and then sorts on the names in alphabetical order. The order of columns is important!SELECT birthdate, name FROM people ORDER BY birthdate, name;GROUP BY.Often you'll need to aggregate results, GROUP BY is for performing operations within groups. For example, you might want to count the number of male and female employees in your company. Here, what you want is to group all the males together and count them, and group all the females together and count them. In SQL, GROUP BY allows you to group a result by one or more columns, like so:
GROUP BY is used with aggregate functions like COUNT() or MAX(). Note that GROUP BY always goes after the FROM clause!GROUP BY with ORDER BY to group your results, calculate something about them, and then order your results. For example:
SELECT a field that is not in your GROUP BY clause without using it to calculate some kind of value about the entire group.HAVING clause.WHERE clauses. For example, the following query is invalid: SELECT release_year FROM films GROUP BY release_year WHERE COUNT(title) > 10;SELECT release_year FROM films GROUP BY release_year HAVING COUNT(title) > 10;avg_budget and avg_gross respectively.filmstable and then use it to get IMDB information from the reviewstable.