SQL Driver

From CometWiki

(Difference between revisions)
Jump to: navigation, search
(ALLATONCE vs. ROWBYROW Query Result Type)
Line 154: Line 154:
By configuring the device to use the ROWBYROW result method, instead of transferring the entire result set back in a single chunk, the result table is retrieved on demand, one row at a time.  That is, once you have submitted a query, each time a Read() or GETNEXTROW is performed, the SQL device goes out to the MySql server and retrieves the next row:
By configuring the device to use the ROWBYROW result method, instead of transferring the entire result set back in a single chunk, the result table is retrieved on demand, one row at a time.  That is, once you have submitted a query, each time a Read() or GETNEXTROW is performed, the SQL device goes out to the MySql server and retrieves the next row:
-
==Interface==
+
<pre>
 +
Open( SqlLun ) "SQL"
 +
result$ = Control( SqlLun, "CONNECT sampledb localhost testuser testpassword" ) ! Default port 3306 is OK
 +
 
 +
! Set the device to get each row on demand as we request them
 +
result$ = Control( SqlLun, "SETRESULTMETHOD ROWBYROW" )
 +
 
 +
! Use SQLQUERY for query-type SQL statements
 +
result$ = Control( SqlLun, "SQLQUERY SELECT custname,zipcode,phone FROM custtable WHERE state='CA' AND shipping=0.00" )
 +
 
 +
Do
 +
  ! Read in the next row, exception on end of result table
 +
  Read( SqlLun, DataFmt ) Excp=EOF
 +
 
 +
  ! do something with the results...
 +
  Print "name: "; name$; ", zip code: "; zipcode; ", phone number: "; phonenumber$
 +
loop
 +
 
 +
EOF:
 +
! No more rows
 +
 
 +
result$ = Control( SqlLun, "DISCONNECT" )
 +
Close( SqlLun )
 +
</pre>
 +
 
 +
So each time the Read() function is executed, the device connects to the MySql server and retrieves the next row.  The benefit of this is that you don't have to store all of the results on the local workstation, but at the cost of having to connect to the database and retrieve the next row each time you Read().  Due to the nature of the ROWBYROW result method, you cannot perform the GETALLROWS control (if you wanted to, why are you using the ROWBYROW method in the first place?).  Also, due to limitations in the connectivity of the MySql server, you '''must''' process all of the rows in a result table before you perform another query.  That is, you must Read() or GETNEXTROW until there are no more rows left.  If you try to perform another query first, you will receive an error.
 +
 
 +
==Reference==
The Sql control functions are broken into three categories: Management controls deal with the database connection and configuration, Non-Query Sql controls deal with non-query Sql statements, and Query Sql controls deal with all query-type Sql statements.  The Non-Query Sql controls are designed to deal with executing all Sql statements that don't perform queries or return result sets.  Sql commands like INSERT, CREATE, and UPDATE all perform some action in Sql, but return a simple result like the number of rows affected.  Conversely, the Query Sql controls are designed to deal with executing all the Sql statements that perform queries and return result sets.  Basically SELECTS are issued using this category, and controls are provided to get rows and results out of the returned table.
The Sql control functions are broken into three categories: Management controls deal with the database connection and configuration, Non-Query Sql controls deal with non-query Sql statements, and Query Sql controls deal with all query-type Sql statements.  The Non-Query Sql controls are designed to deal with executing all Sql statements that don't perform queries or return result sets.  Sql commands like INSERT, CREATE, and UPDATE all perform some action in Sql, but return a simple result like the number of rows affected.  Conversely, the Query Sql controls are designed to deal with executing all the Sql statements that perform queries and return result sets.  Basically SELECTS are issued using this category, and controls are provided to get rows and results out of the returned table.

Revision as of 20:52, 2 December 2009

Contents

Introduction

The Sql device allows you to connect to a MySQL database (remote or local) from within an IB application. It is designed to provide a simple way to access your mysql data for use within Comet. Once the Sql device is open, you can connect to a database and server, perform any Sql statement (queries or modifications), and read the results into a format or string. Please write any questions or comments on the discussion page - this is a work in progress and we'd really like to hear back from any developers that actively use MySql as to what you would like to see here!

Subscriber Product

The Sql device is available in Comet32 for all Comet Subscribers. It will work with c16 compiled programs as well as c32 programs. It will fail to open for those who are not a curent Comet Subscriber.

Usage

Basic Syntax

The Sql device is opened and accessed using a LUN like most devices. All of the Sql device functions are accessed through the IB Control statement, except for the read function (more on that later). For example, to connect to a database, you would execute the following code:

Open( SqlLun ) "SQL"
result$ = Control( SqlLun, "CONNECT sampledb localhost testuser testpassword 3306" )
Close( SqlLun )

All of the functions in the interface are accessed in this way, with parameters separated by spaces. The return string from a Control statement like this will start either with a "+OK" or "-ERR" to signify whether the control was issued successfully, followed by a short statement with more details (similar to the TCP device). For controls that return data from the SQL server (e.g. getting the number of rows affected, or a result set from a SELECT), there is no "+OK" or "-ERR". Instead the data returned from the server is the only thing contained in the result string. This will be described in more detail below.

Connecting to a Database/Server

You must connect to an SQL database prior to issuing any SQL queries or statements. In order to connect, we must provide the SQL server address/port and database we want to connect to, as well as the username and password to use. The port for the SQL server is optional, and will default to the standard SQL server port (3306) if none is provided. Code to connect and disconnect from an SQL database would usually look like this:

Open( SqlLun ) "SQL"
result$ = Control( SqlLun, "CONNECT sampledb localhost testuser testpassword" ) ! Default port 3306 is OK
result$ = Control( SqlLun, "DISCONNECT" )
Close( SqlLun )

result$ contains a string reporting whether each of these controls were executed successfully. The disconnect control takes no arguments. It disconnects from the current server/database, and readies the SQL device for another connection. You must disconnect a connected SQL device before you try to connect to another database/server.

Issuing SQL statements to the database/server

There are two categories of statements that can be sent to the SQL server. The first are queries and commands that return data in tables (SELECT, DESCRIBE, etc.), the second are manipulation and other commands that return simple results (INSERT, CREATE, etc.). Because these 2 categories of statements are ideologically different and they return different kinds of results, we have created 2 different controls for issuing SQL statements to the database/server. SQLEXECUTE must be used to issue all non-query simple result statements, whereas SQLQUERY must be used to issue queries and statements that return tables as results. You must use the correct control with its corresponding SQL statement, or you will not be able to access the results of your SQL statement.

Non-Query SQL Controls

Let's say you want to update your customer table so that all customers in a certain zipcode have a new shipping price. Because SQL's UPDATE does not return a table of data as a result (like SELECT would), you must use the SQLEXECUTE control:

Open( SqlLun ) "SQL"
result$ = Control( SqlLun, "CONNECT sampledb localhost testuser testpassword" ) ! Default port 3306 is OK

! Use SQLEXECUTE for non-query-type SQL statements
result$ = Control( SqlLun, "SQLEXECUTE UPDATE custtable SET shipping=9.99 WHERE zipcode=98765" )

! Can only get number of rows affected for SQLEXECUTE type statements
result$ = Control( SqlLun, "GETROWSAFFECTED" )

result$ = Control( SqlLun, "DISCONNECT" )
Close( SqlLun )

In this case, after the SQLEXECUTE control is issued, result$ contains a string reporting whether the SQL statement was executed successfully. If the UPDATE was issued to the database/server successfully, you can then use the GETROWSAFFECTED control to get the number of rows in the table that were affected by this update. In this case, result$ would contain the number of customers (rows) in the 'custtable' table that had the zipcode 98765.

Query SQL Controls

Now let's say you want to issue a query on the 'custtable' table in order to get a list of customers that are in California and have free shipping. Because you are issuing a query, you must use the SQLQUERY control. This tells the device that you are issuing a query, and that we will be asking for results in the form of a table.

Open( SqlLun ) "SQL"
result$ = Control( SqlLun, "CONNECT sampledb localhost testuser testpassword" ) ! Default port 3306 is OK

! Use SQLQUERY for query-type SQL statements
result$ = Control( SqlLun, "SQLQUERY SELECT custname,zipcode,phone FROM custtable WHERE state='CA' AND shipping=0.00" )

result$ = Control( SqlLun, "DISCONNECT" )
Close( SqlLun )

In this case, after the SQLQUERY control is issued, result$ contains a string reporting whether the SQL statement was executed successfully. There are a few different controls used to retrieve the results from an SQL query, where each one presents the results to you in a different way.

Query Results by Read()

After processing an SQL query, the results of the query are stored in a table. This table has a row for each result, and the columns represent the fields you want data for. The most straight forward method of retrieving your results from a query is to use the Read() function. Read() iterates through the result table, filling in the variables defined by the format statement with the corresponding columns of the result row. The variables are index by position only, not by name - this means that you must define your format statement to be in the same order as the columns you selected in the SQLQUERY. Here is how you might get your results using the Read() function:

DataFmt: format name$;zipcode;phonenumber$

!...open and connect...

result$ = Control( SqlLun, "SQLQUERY SELECT custname,zipcode,phone FROM custtable WHERE state='CA' AND shipping=0.00" )

Do
  ! Read in the next row, exception on end of result table
  Read( SqlLun, DataFmt ) Excp=EOF

  ! do something with the results...
  Print "name: "; name$; ", zip code: "; zipcode; ", phone number: "; phonenumber$
loop

EOF:
! No more rows

!...disconnect and close...

When the Read() function has exhausted all of the rows in the result table, it will return an exception for END_OF_FILE, which you can use to determine when you've finished iterating through your results.

Query Results by GETNEXTROW

The GETNEXTROW control processes the result table one row at a time, similar to the Read() function. But instead of populating variables in a format statement, GETNEXTROW returns a single string with all of the fields delimited by a single character. This way you can process each row however you like, and skip the process of copying the fields into variables if you don't need to. The control is used like this:

!...open and connect...

result$ = Control( SqlLun, "SQLQUERY SELECT custname,zipcode,phone FROM custtable WHERE state='CA' AND shipping=0.00" )

Do
  ! Read in the next row, exception on end of result table
  result$ = Control( SqlLun, "GETNEXTROW &" )

  ! Check for end of result table
  if sub( result$, 1, 1 ) EQ "-" GOTO EOF

  ! do something with the results...
  Print result$
loop

EOF:
! No more rows

!...disconnect and close...

The above code gets each row in a single string where the fields are delimited by '&'. Once the rows in the result table have been exhausted, the next GETNEXTROW will return an error string of the form "-ERR". This means that you can simply check for a '-' at the start of the result string to see if you have processed all rows.

Query Results by GETALLROWS

The GETALLROWS control, unlike Read() and GETNEXTROW, returns the entire result set from your query in a single string. This is useful at times when you don't necessarily need to process all of the results (for instance if you are just piping the results to a file), or if you don't need to process each row separately. The GETALLROWS control looks like this:

!...open and connect...

result$ = Control( SqlLun, "SQLQUERY SELECT custname,zipcode,phone FROM custtable WHERE state='CA' AND shipping=0.00" )

! Read all of the results into a single string buffer - row delimiter is carriage return, field delimiter is ','
result$ = Control( SqlLun, "GETALLROWS @0D@ ," )

!...disconnect and close...

The above code copies all of the rows and columns of your result table into a single string. The rows are delimited by a hex 0D (carriage return), and the fields in each row are delimited by a comma ','. You could now print result$ to a file, and have a list of all customers with their zipcodes and phone numbers that satisfy the search criteria. Be aware that there may be buffer size limitations with this control - please see the implementation details and known issues section for more details.

Advanced: Dealing with Very Large Result Sets

You should never need to use the functionality described in this section. For the vast majority of SQL queries and result processing (and i mean virtually all of it), the functions and controls described above should be more than adequate. You could stop reading this guide right now and never run into a problem that this section solves. Now that you understand just how seldom it should be used, you can read on and learn how very large result sets can be processed more efficiently.

First of all, it should be clear that we are talking about very large result sets. When the MySql server returns the result table normally, it is transferred in a single chunk of data (usually stored in a vector or other standard C++ container). This only becomes an issue if you select a large number of rows with large data fields (for instance, a million rows at 1k bytes per row would require ~100GB of local storage). Now, most of the time there are ways to trim down your result table - selecting only the columns you need instead of SELECT *, being more limiting with your WHERE clause, or even specifying a LIMIT in your query. In the rare instance that you cannot pare down your result table any more, we provide an alternate internal storage method.

By configuring the device to use the ROWBYROW result method, instead of transferring the entire result set back in a single chunk, the result table is retrieved on demand, one row at a time. That is, once you have submitted a query, each time a Read() or GETNEXTROW is performed, the SQL device goes out to the MySql server and retrieves the next row:

Open( SqlLun ) "SQL"
result$ = Control( SqlLun, "CONNECT sampledb localhost testuser testpassword" ) ! Default port 3306 is OK

! Set the device to get each row on demand as we request them
result$ = Control( SqlLun, "SETRESULTMETHOD ROWBYROW" )

! Use SQLQUERY for query-type SQL statements
result$ = Control( SqlLun, "SQLQUERY SELECT custname,zipcode,phone FROM custtable WHERE state='CA' AND shipping=0.00" )

Do
  ! Read in the next row, exception on end of result table
  Read( SqlLun, DataFmt ) Excp=EOF

  ! do something with the results...
  Print "name: "; name$; ", zip code: "; zipcode; ", phone number: "; phonenumber$
loop

EOF:
! No more rows

result$ = Control( SqlLun, "DISCONNECT" )
Close( SqlLun )

So each time the Read() function is executed, the device connects to the MySql server and retrieves the next row. The benefit of this is that you don't have to store all of the results on the local workstation, but at the cost of having to connect to the database and retrieve the next row each time you Read(). Due to the nature of the ROWBYROW result method, you cannot perform the GETALLROWS control (if you wanted to, why are you using the ROWBYROW method in the first place?). Also, due to limitations in the connectivity of the MySql server, you must process all of the rows in a result table before you perform another query. That is, you must Read() or GETNEXTROW until there are no more rows left. If you try to perform another query first, you will receive an error.

Reference

The Sql control functions are broken into three categories: Management controls deal with the database connection and configuration, Non-Query Sql controls deal with non-query Sql statements, and Query Sql controls deal with all query-type Sql statements. The Non-Query Sql controls are designed to deal with executing all Sql statements that don't perform queries or return result sets. Sql commands like INSERT, CREATE, and UPDATE all perform some action in Sql, but return a simple result like the number of rows affected. Conversely, the Query Sql controls are designed to deal with executing all the Sql statements that perform queries and return result sets. Basically SELECTS are issued using this category, and controls are provided to get rows and results out of the returned table.

The following sections show the syntax for all of the Sql Control strings:

Management Controls

These allow the programmer to control the database connection as well as any configuration options available for the Sql device.

  • Connect <database> <server> <user> <password> <port>
    • Connects to the database on the server with the parameters specified. If your database is on the local machine, just use localhost. The final three parameters are optional (port defaults to the normal Sql port - 3306, and the user/password default to blank). Although hopefully you don't have any users with blank passwords!
  • Disconnect
    • Disconnects from the database and server.
  • SetResultMethod <ALLATONCE/ROWBYROW>
    • Specifies whether the queries issued should return results using the ALLATONCE or ROWBYROW methods - more details above.
  • GetLastError
    • Returns a string containing details on the last mysql++ error.

Non-Query Sql Controls

These functions are used to issue and report results for Sql commands that are not queries (i.e. INSERT, CREATE, UPDATE, etc.).

  • SqlExecute <sql_string>
    • Issues a non-query Sql command to the database and server. SELECT should not be used with this control.
  • GetRowsAffected
    • Returns the number of rows that were affected by the last SqlExecute control.

Query Sql Controls

These functions are used to issue Sql queries and retrieve the resulting tables (i.e. SELECT statements). NOTE: result strings from Control statements are restricted to 8192 bytes! If your row or result table is longer than this, you will get a buffer overflow error.

  • SqlQuery <sql_string>
    • Issues a query Sql command to the database and server. You should only use Sql commands that return result tables with this control.
  • GetNextRow <delimiter>
    • Returns the next row as a string delimited by the parameter (single character only).
  • GetAllRows <row delimiter> <field delimiter>
    • (Can only be used on ALLATONCE queries!) Returns all of the result rows as a string delimited by the row and field delimiters (single characters only).

Future Controls?

Some day I think that it would be nice to abstract out the Sql commands by providing functions like these:

QueryOn( "customer_id", "customer_name" )
LimitOn( "customer_state", "NY" )
LimitOn( "customer_since", "1999-12-31" )
SortBy( "customer_id" )

resultSet = SendStatement()

where we can build the Sql queries based on the tables and constraints provided.

Though once we start moving into JOINs and the like, this kind of interface seems hard to implement. My impression is that anybody doing any non-trivial SQL work inside IB will probably have to have an understanding of SQL syntax and implementation.

Implementation Details

The SQL driver utilizes the MySQL++ library to interface with the MySQL server. The library is compiled and linked into our own library, which provides the interface to the IB applications. Currently the MySQL++ library as well as our library is compiled and linked dynamically (.dll). This means that both dlls are required to run comet with Sql support.

The MySQL++ library and documentation can be found at the MySQL++ Website