January 2014 |
|
March 2014 |
# mysql history file: $HOME/.mysql_history # During interactive session I can go to previous command using up/down arrowTo check if mysql is installed
mysql --help mysql --versionRun some simple tests to verify that you can retrieve information from the server.
mysqlshow mysqlshow mysql mysql -e "SELECT Host,Db,User FROM db" mysqlTo start a mysql interactive session
mysql # or mysql -h host -u user -pTo exit mysql, type ctr^d or "Q" or "QUIT".
\! pwdSome very basic commands
SELECT VERSION(), CURRENT_DATE; SELECT VERSION(), NOW(); SELECT USER(), NOW(); SELECT SIN(PI()/4), (4+1)*5;Let's try to write query results into a text file
SELECT NOW(), USER(), SIN(PI()/4), (4+1)*5 INTO OUTFILE './test1.txt'; # For some reason this file gets written as: /usr/local/mysql/data/test1.txtTo write a CSV file instead
SELECT NOW(), USER(), SIN(PI()/4), (4+1)*5 INTO OUTFILE './test1.csv' FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\n';To see the available databases in memory
SHOW DATABASES;Let's try to access a database named "test"
USE test;Next, let's check how many tables I have in this database
SHOW TABLES;Now let's see what information is there in a table called 'func'. If the table is not empty then we can run SELECT, JOIN, PROJECT, ... queries.
DESCRIBE func;To run the query from a script instead of interactively:
source filename; # Or, \. filenameTo run the above script from commandline (or in batch mode):
mysql db_name < script.sql
Notes:
mysql < script.sql | moreOr, save the output in a log file
mysql < script.sql > mysql.out
SELECT '<info_to_display>' AS ' ';
--html, -H # Produce HTML output. --silent, -s # Silent mode. Produce less output. --skip-column-names, -N # Do not write column names in results. --table, -t # Display output in table format. --verbose, -v # Verbose mode. --xml, -X # Produce XML output. --raw # Write column values without escape conversion --delimiter=str # Set statement delimiter. The default is ";”". --debug-info, -T --batch, -B # Print results using tab as the column separator
Data import from a text file
Here is an example
CREATE TABLE features (Store INT,Date DATE,Temperature DEC,Fuel_Price DEC,MarkDown1 DEC,MarkDown2 DEC,MarkDown3 DEC,MarkDown4 DEC,MarkDown5 DEC,CPI DEC,Unemployment DEC,IsHoliday VARCHAR(20)); LOAD DATA INFILE './features.csv' INTO TABLE features FIELDS TERMINATED BY ',' LINES TERMINATED BY '\r\n' IGNORE 1 LINES ( Store,Date,Temperature,Fuel_Price,MarkDown1,MarkDown2,MarkDown3,MarkDown4,MarkDown5,CPI,Unemployment,IsHoliday );For simple files this can also be done from command line
mysqlimport db_name textfile1 [textfile2 ...]For each text file named on the command line, mysqlimport strips any extension from the file name and uses the result to determine the name of the table into which to import the file's contents.
Practice test:
Practice test samples of SQL query are available at
http://sqlzoo.net/wiki/Main_Page
http://beginner-sql-tutorial.com
http://www.w3schools.com/sql/default.asp
SELECT pageURL, pageRank FROM Rankings WHERE pageRank > 10Query to find ad-revenue by IP prefix
SELECT SUBSTR(sourceIP, 1, 7), SUM(adRevenue) FROM userVisits GROUP BY SUBSTR(sourceIP, 1, 7)Note: The built-in function SUBSTR() or SUBSTRING() returns a substring. The first argument is the string. The second argument is the position of the first character to extract. The third argument is the number of characters to extract.
SELECT * INTO newtable [IN externaldb] FROM table1;Or we can copy only the columns we want into the new table:
SELECT column_name(s) INTO newtable [IN externaldb] FROM table1;The new table will be created with the column-names and types as defined in the SELECT statement. You can apply new names using the AS clause.
SELECT * FROM R1 UNION [ALL] SELECT * FROM R2And the result
R1 R2 R1 U R2 R1 U R2 with ALL option ------- ------- ------- ----------------------- A B A B A B A B a1 b1 a1 b1 a1 b1 a1 b1 a2 b1 a3 b4 a2 b1 a2 b1 a3 b4 a3 b4 a1 b1
Here is an example of difference (R1 - R2):
SELECT * FROM R1 EXCEPT SELECT * FROM R2And the result
R1 - R2 ------- a2 b1An example of join (or inner join, R1 JOIN R2 on A=B):
SELECT * FROM R1, R2 WHERE R1.A = R2.Aor, alternatively
SELECT * FROM R1 JOIN R2 ON R1.A = R2.ABoth of the above are ways to spell the same query. The result is
R1.A R1.B R2.A R2.B a1 b1 a1 b1
Q: Write query to find all hospitals within 5 miles of a school.
SELECT DISTINCT h.name FROM hospitals h, Schools s WHERE distance (h.location, s.location) < 5Q: Write query to find all user clicks made within 5 seconds of a page load.
SELECT * FROM Clicks c, PageLoad p WHERE abs(c.click_time - p.load_time)< 5Here is an example of outer join using the tables shown below
P J ---------------------- ---------------------- age zip disease job age zip ---------------------- ---------------------- 54 98125 heart lawyer 54 98125 20 98120 flu cashier 20 98120 33 98120 lung
SELECT * FROM P LEFT OUTER JOIN J ON P.age = J.age AND P.zip = J.zipThe result is
---------------------------- age zip disease job ---------------------------- 54 98125 heart lawyer 20 98120 flu cashier 33 98120 lung nullOne can similarly perform right outer join which will keep all elements from the second table instead. An example of JOIN and SELECT INTO statements used in the same query
SELECT sourceIP, AVG(pageRank) as avgPageRank, SUM(adRevenue) as totalRevenue INTO Temp FROM Rankings AS R, UserVisits AS UV WHERE R.pageURL = UV.destURL AND UV.visitDate BETWEEN '2000-01-05' AND '2000-01-22' GROUP BY UV.sourceIP; SELECT sourceIP, totalRevenue, avgPageRank FROM Temp ORDER BY totalRevenue DESC Limit 1;An example of distributed query
CREATE VIEW Sales AS SELECT * FROM JanSales UNION ALL SELECT * FROM FebSales UNION ALL SELECT * FROM MarSales; CREATE TABLE MarSales ( OrderID INT, CustomerID INT NOT NULL, OrderDate DATETIME NULL CHECK( DATEPART(mm, OrderDate)=3), CONSTRAINT OrderIDMonth PRIMARY KEY(OrderID) ) --- similarly, create tables for FebSales and JanSales
CREATE INDEX name_index ON Employee (Employee_Name);In practice, we first create index on a column of the table and then perform query on that column
CREATE INDEX seq_idx ON sequence(seq); SELECT seq FROM sequence WHERE seq = 'GATTACGATTA'To create a multi-column index
CREATE INDEX name_index ON Employee (Employee_Name, Employee_Age);Note: SQL Logical Operations have 3 possible values: TRUE, FALSE, and UNKNOWN. In the majority of databases, a comparison to NULL returns UNKNOWN - this is true even when comparing NULL to NULL. The correct way to check for a NULL or a non-NULL column is to use the IS NULL or the IS NOT NULL syntax.
Computer { modelNumber CHAR(30) NOT NULL, laptopModel CHAR(15), }Assume that the table stores entries for all the makes of PCs and laptops - and if it's a laptop the laptopModel field is set. Given this information, write an SQL statement that returns only the PCs and no laptops from the table above. Remember that we cannot use the equality operator (the "=") to test for a NULL column value.
SELECT * FROM Computer WHERE laptopModel IS NULL;
Compare ---------- | Numbers | ---------- | 40 | ---------- | 68 | ---------- | -20 | ---------- | 99 | ----------Here is a neat way to do this
SELECT Numbers FROM Compare ORDER BY Numbers DESC --order in descending order LIMIT 1; --retrieve only one value
2.) Suppose we want to find the 2nd highest salary in a table
SELECT MAX(Salary) FROM Employee WHERE Salary NOT IN (SELECT MAX(Salary) FROM Employee );
3.) Now, suppose we want to find Nth highest salary
SELECT Salary FROM Employee ORDER BY Salary DESC LIMIT N-1,1;where substitute 'N' by the appropriate numerical value.
CREATE TABLE EMPLOYEE ( EMPLOYEE_ID CHAR(5) NOT NULL, EMPLOYEE_NAME VARCHAR(50) NOT NULL, PRIMARY KEY (EMPLOYEE_ID) ) ENGINE = INNODB;If a table is created with one of the storage engines that supports transactions - either BDB or InnoDB - then there are two modes of transaction possible:
SET AUTOCOMMIT=0; SET AUTOCOMMIT=1;
COMMIT ROLLBACK SAVEPOINT START TRANSACTION
GRANT privilege_name /* e.g., ALL, SELECT, EXECUTE, UPDATE, INSERT, CREATE */ ON object_name /* e.g., TABLE, VIEW, SEQUENCE */ TO {user_name |PUBLIC |role_name} /* Use PUBLIC to grant access to all users */ [WITH GRANT OPTION]; /* allows a user to grant access rights to other users */For example
GRANT SELECT ON SOME_TABLE TO SOME_USER;
REVOKE privilege_name ON object_name FROM {user_name |PUBLIC |role_name};For example
REVOKE SELECT ON employee FROM user1;
CREATE USER username [IDENTIFIED BY password] [other options];
Case statement:
CASE input_expression /* usually a column_name */ WHEN conditional_expression1 THEN result_expression1 [WHEN conditional_expression2 THEN result_expression2] [ELSE result_expression] ENDFor example
SELECT LANGUAGE_NAME, RATING_CODE AS RATING, CASE RATING_CODE WHEN '3' THEN 'DIFFICULT' WHEN '2' THEN 'FAIRLY DIFFICULT' WHEN '1' THEN 'EASY' ELSE 'UNKNOWN' END AS RATING_DESCRIPTION FROM PROGRAMMING_LANGUAGES;
RANK and DENSE_RANK statements:
Use of RANK()
SELECT Employee_ID, Salary, RANK() over (ORDER BY Salary DESC) AS Ranking FROM Employee;I get an output like
Employee_ID Salary Ranking 4 800 1 3 200 2 9 200 2 12 100 4Use of DENSE_RANK()
SELECT Employee_ID, Salary, DENSE_RANK() over (ORDER BY Salary DESC) AS DenseRank FROM Employee;The result is
Employee_ID Salary DenseRank 4 800 1 3 200 2 9 200 2 12 100 3
RENAME a table:
RENAME TABLE {tbl_name} TO {new_tbl_name};For example
CREATE TABLE employees ( id NUMBER(6), name VARCHAR(20) ); INSERT INTO employees( id, name ) values( 1, 'name 1'); INSERT INTO employees( id, name ) values( 2, 'name 2'); INSERT INTO employees( id, name ) values( 3, 'name 3'); SELECT * FROM employees; RENAME TABLE employees TO employees_new; SELECT * FROM employees_new;
Create view: A view is a relation defined by a query. The syntax is
CREATE VIEW view_name AS SELECT column_list FROM table_name [WHERE condition];For example, to create a view on the product table
CREATE VIEW view_product AS SELECT product_id, product_name FROM product;Another example. Let's assume we have two tables:
CREATE VIEW StorePrice AS SELECT x.store, y.store FROM Purchase x, Product y WHERE x.pid = y.pid;
How to use a view ? Just like a normal table. For example,
SELECT DISTINCT z.name, u.store FROM customer z, Purchase u, StorePrice v WHERE z.cid = u.customer AND u.store = v.store AND v.price > 1000;
CREATE FUNCTION CubicVolume (@CubeLength decimal(4,1), @CubeWidth decimal(4,1), @CubeHeight decimal(4,1) ) RETURNS decimal(12,3) AS BEGIN RETURN ( @CubeLength * @CubeWidth * @CubeHeight ) ENDThe above function is a scalar UDF because it returns a single value of a predefined data type. This function can then be used like
SELECT CubicVolume(r.a, r.b, r.c) FROM Shape r WHERE CubicVolume(r.a, r.b, r.c) > 10;SQL also supports user-defined functions that return a table data type. Here is an example to create a table with 100,000 rows with random values from 1 to 10 in the DataVal column.
CREATE TABLE LargeTable (KeyVal int NOT NULL PRIMARY KEY, DataVal int NOT NULL CHECK (DataVal BETWEEN 1 AND 10) ); WITH Digits AS (SELECT d FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) AS d(d)) INSERT INTO LargeTable (KeyVal, DataVal) SELECT 10000 * a1.d + 1000 * a2.d + 100 * a3.d + 10 * a4.d + a5.d + 1, 10 * RAND(CHECKSUM(NEWID())) + 1 FROM Digits AS a1, Digits AS a2, Digits AS a3, Digits AS a4, Digits AS a5;We can use the above function as
SELECT DataVal FROM LargeTable( 2, 2);
vagrant up (start your virtual machine) vagrant suspend (put the virtual machine to sleep, it takes ~ a minute to complete 'vagrant up' afterward) vagrant halt (shuts down the virtual machine, it takes ~ a minute to complete 'vagrant up' afterward) vagrant ssh (login to the virtual machine) vagrant destroy (destroys the virtual machine, it takes ~ 20 minutes to complete 'vagrant up' again)
This is a three step process. Step 1: Produce W+jets events in MCFM using close-to-analysis-level cuts. Configuration can be found herehttps://github.com/kalanand/UserCode/blob/master/WJets/inputWjets_LnuJJ.DAT
Changes needed: '6.0' [file version number] 'tota' [part 'lord','real' or 'virt','tota'] 7000d0 [sqrts in GeV] Please change these to actual version number, 'lord' and 8000d0, respectively. Please also adjust the jet, lepton, and MET cuts to match the analysis requirements.Documentation: http://mcfm.fnal.gov/mcfm.pdf
----- Preparation if you have never run MCFM ----- - Please checkout MCFM from the above link or from here:https://twiki.cern.ch/twiki/bin/viewauth/CMS/MCFM
Alternatively: Run from Yurii's directory (~maravin/nobackup/MCFM/Bin, so you do not need to install yourself. --------------------------------------------------------- Step 2: Please change the factorization/renormalization scale in the following lines 1d0 [scale:QCD scale choice] 1d0 [facscale:QCD fac_scale choice] Change both to 12d0 for factorization/renormalization scale-down and to 15d0 for scale-up. Step 3: Plot the ratio histograms: scale-up/default and scale-down/default in hadronic W pt. Now apply these ratios to your Madgraph simulated W+jets events. You will get the uncertainty envelop similar to the one I had last year.Turns out that in the more recent versions of MCFM, 12d0 and 15d0 are incorrect codes for scale up/down, we should be using -3d0 and -6d0. Phil was able to debug this, here is his full message
I decided to look myself at the MCFM code, and I find these snippets in the src/Need/reader_input.f: c--- set up the default choices of static scale, if required if (scale .lt. 0d0) then if (scale .eq. -2d0) then factor=0.25d0 elseif (scale .eq. -3d0) then factor=0.5d0 elseif (scale .eq. -4d0) then factor=0.75d0 elseif (scale .eq. -5d0) then factor=1d0 elseif (scale .eq. -6d0) then factor=2d0 elseif (scale .eq. -7d0) then factor=4d0 else factor=1d0 endif and if (facscale .lt. 0d0) then if (facscale .eq. -2d0) then factor=0.25d0 elseif (facscale .eq. -3d0) then factor=0.5d0 elseif (facscale .eq. -4d0) then factor=0.75d0 elseif (facscale .eq. -5d0) then factor=1d0 elseif (facscale .eq. -6d0) then factor=2d0 elseif (facscale .eq. -7d0) then factor=4d0 else So I think 12d0 and 15d0 are the incorrect codes, I should be using -3d0 and -6d0.
Infinite geometric series:
For r ∈ (-1, 1), the geometric series is given by
Σ∞i=0 a * ri = a / (1 − r).
Fibonacci Number:
The Fibonacci numbers are the sequence of numbers
{Fn}∞n=1 defined by the linear recurrence equation
Fn = F(n-1) + F(n-2)
with F1 = F2 = 1.
It is conventional to define F0 = 0.
The Fibonacci numbers for n=1, 2, ... are 1, 1, 2, 3, 5, 8, 13, 21, ...
Fibonacci numbers can be viewed as a particular case of the Fibonacci polynomials Fn(x) with Fn = Fn(1).
The closed form for Fn is given by
Fn = (αn − βn)/(α − β),
where α and β are the roots of x2 = Ax+B.
Here, A = B = 1, so the equation becomes
x2 −x −1 = 0,
which has roots
x = (1/2) * (1 ± √5).
The closed form is therefore given by
Fn = [(1 + √5)n − (1 − √5)n] / (2n √5)
Benford's law:
Benford's Law, also called the First-Digit Law, refers to the frequency distribution of digits in many real-life sources of data. In this distribution, the number 1 occurs as the leading digit about 30% of the time, while larger numbers occur in that position less frequently: 9 as the first digit less than 5% of the time.
A set of numbers is said to satisfy Benford's Law if the leading digit d
occurs with probability
P(d) = log10 (1 + 1/d).