SQL Driver

From CometWiki

Introduction

The Sql device allows you to connect to a MySQL database (remote or local) from within an IB application. It is designed to provide a simple way to access your mysql data for use within Comet. Once the Sql device is open, you can connect to a database and server, perform any Sql statement (queries or modifications), and read the results into a format or string. Please write any questions or comments on the discussion page - this is a work in progress and we'd really like to hear back from any developers that actively use MySql as to what you would like to see here!

Subscriber Product

The Sql device is available in Comet32 for all Comet Subscribers. It will work with c16 compiled programs as well as c32 programs. It will fail to open for those who are not a curent Comet Subscriber.

MySql Server Required

In order to use the SQL device, you must have a MySql Server configured and installed someplace accessible to your workstation. If you do not already have one, you can download and install one for free locally. First get the proper Mysql 5.1 Community Server installation at the MySql Website. Although the SQL device should work with all versions of MySql server, we advise you to get version 5.1 because that is what we have been using for testing. Follow the directions through the installation wizard and refer to the MySql Docs for reference. Once you have the Community Server installed and configured, you will need to configure databases and tables to work with using the CREATE DATABASE and CREATE TABLE operators in the MySql command-line client.

Sample Programs

Please see the SQL Driver Sample Programs page for examples on how to set up your MySql database and using Internet Basic to utilize it.

Using the SQL Device

Basic Syntax

The Sql device is opened and accessed using a LUN like most devices. All of the Sql device functions are accessed through the IB Control statement, except for the read function (more on that later). For example, to connect to a database, you would execute the following code:

Open( SqlLun ) "SQL"
result$ = Control( SqlLun, "CONNECT sampledb localhost testuser testpassword 3306" )
Close( SqlLun )

All of the functions in the interface are accessed in this way, with parameters separated by spaces. The return string from a Control statement like this will start either with a "+OK" or "-ERR" to signify whether the control was issued successfully, followed by a short statement with more details (similar to the TCP device). For controls that return data from the SQL server (e.g. getting the number of rows affected, or a result set from a SELECT), there is no "+OK" or "-ERR". Instead the data returned from the server is the only thing contained in the result string. This will be described in more detail below.

Connecting to a Database/Server

You must connect to an SQL database prior to issuing any SQL queries or statements. In order to connect, we must provide the SQL server address/port and database we want to connect to, as well as the username and password to use. The port for the SQL server is optional, and will default to the standard SQL server port (3306) if none is provided. Code to connect and disconnect from an SQL database would usually look like this:

Open( SqlLun ) "SQL"
result$ = Control( SqlLun, "CONNECT sampledb localhost testuser testpassword" ) ! Default port 3306 is OK
result$ = Control( SqlLun, "DISCONNECT" )
Close( SqlLun )

result$ contains a string reporting whether each of these controls were executed successfully. The disconnect control takes no arguments. It disconnects from the current server/database, and readies the SQL device for another connection. You must disconnect a connected SQL device before you try to connect to another database/server.

Error Reporting

Most errors you need to know about are contained in the return value from the Control() function, preceded by "-ERR" for convenience. If more information is needed, some times you can use the GETLASTERROR control to get more details about the last internal error encountered:

Open( SqlLun ) "SQL"
result$ = Control( SqlLun, "CONNECT sampledb localhost testuser testpassword" ) ! Default port 3306 is OK

! Check for error
if sub( result$, 1, 1 ) EQ "-" error$ = Control( SqlLun, "GETLASTERROR" )

! Report errors
print result$; " >> "; error$

Close( SqlLun )

Issuing SQL statements to the database/server

There are two categories of statements that can be sent to the SQL server. The first are queries and commands that return data in tables (SELECT, DESCRIBE, etc.), the second are manipulation and other commands that return simple results (INSERT, CREATE, etc.). Because these 2 categories of statements are ideologically different and they return different kinds of results, we have created 2 different controls for issuing SQL statements to the database/server. SQLEXECUTE must be used to issue all non-query simple result statements, whereas SQLQUERY must be used to issue queries and statements that return tables as results. You must use the correct control with its corresponding SQL statement, or you will not be able to access the results of your SQL statement.

Non-Query SQL Controls with SQLEXECUTE

Let's say you want to update your customer table so that all customers in a certain zipcode have a new shipping price. Because SQL's UPDATE does not return a table of data as a result (like SELECT would), you must use the SQLEXECUTE control:

Open( SqlLun ) "SQL"
result$ = Control( SqlLun, "CONNECT sampledb localhost testuser testpassword" ) ! Default port 3306 is OK

! Use SQLEXECUTE for non-query-type SQL statements
result$ = Control( SqlLun, "SQLEXECUTE UPDATE custtable SET shipping=9.99 WHERE zipcode=98765" )

! Can only get number of rows affected for SQLEXECUTE type statements
result$ = Control( SqlLun, "GETROWSAFFECTED" )

result$ = Control( SqlLun, "DISCONNECT" )
Close( SqlLun )

In this case, after the SQLEXECUTE control is issued, result$ contains a string reporting whether the SQL statement was executed successfully. If the UPDATE was issued to the database/server successfully, you can then use the GETROWSAFFECTED control to get the number of rows in the table that were affected by this update. In this case, result$ would contain the number of customers (rows) in the 'custtable' table that had the zipcode 98765.

Insert/Update/Replace with Write()

If you know the SQL syntax, using SQLEXECUTE gives you the most control when writing to a table. You are able to build the SQL string yourself, and get the most detailed about how you would like the database to perform.

The other option is to use the Write() function with a format. In order to do this, you must define your format list with the correct variable types. This variable list must correspond to the columns in the SQL table. That is, for each column in the table, you must provide a value in the format list, they must be in the correct column order, and they must match the data type (numeric or string types). You must also specify which table you want to write to prior to inserting the data using the SETWRITETABLE control. See below for an example:

SqlFmt: format id; intdata; strdata$

! Open and connect...

result$ = Control( SqlLun, "SETWRITETABLE example" )

id = 47
intdata = -12
strdata$ = "test string"

write( SqlLun, SqlFmt) excp=err

Essentially, the Write() function is a wrapper around the SQLEXECUTE statement. Internally, we take the format list, and build an SQL statement to write the values into the specified table. The write is performed using the SQL "REPLACE" method, which performs a "DELETE" then "INSERT" if the row already exists (based on the table's keys and unique fields). So for example, the above Write() would be equivalent to the following SQLEXECUTE command:

result$ = Control( SqlLun, "SQLEXECUTE REPLACE INTO example VALUES( 47, -12, 'test string' )" )

Specify the SQL table with SETWRITETABLE

The SETWRITETABLE control specifies which table the Write() function should use. The function does not check that the table exists in the SQL database. Once you perform the Write(), if the table does not exist, the error will be reported in SQLGETLASTERROR. The syntax is as follows:

result$ = Control( SqlLun, "SETWRITETABLE example" )

Using a default value with Write()

In SQL, you can specify that a column should be the default value by specifying "default" (without quotes) in the values list. This can be done easily using the SQLEXECUTE statement like this:

result$ = Control( SqlLun, "SQLEXECUTE REPLACE INTO example VALUES( default, -12, 'default id' )" )

for a default id column or for the default string column:

result$ = Control( SqlLun, "SQLEXECUTE REPLACE INTO example VALUES( 47, -12, default )" )

Unfortunately you cannot specify "default" with the Write() function, because for numerics it would be the wrong type, and for string columns, it would just write the string "default" instead of the actual default value.

Instead, we use the hex value 0xFF ("@FF@") to specify that the default value. This also means that in some cases, you will need to have a separate format list to specify the default value. For example:

SqlFmtDefID: format default$; intdata; strdata$

! Open and connect...

result$ = Control( SqlLun, "SETWRITETABLE example" )

default$ = "@FF@"
intdata = -12
strdata$ = "test string"

write( SqlLun, SqlFmtDefID) excp=err

Query SQL Controls

Now let's say you want to issue a query on the 'custtable' table in order to get a list of customers that are in California and have free shipping. Because you are issuing a query, you must use the SQLQUERY control. This tells the device that you are issuing a query, and that we will be asking for results in the form of a table.

Open( SqlLun ) "SQL"
result$ = Control( SqlLun, "CONNECT sampledb localhost testuser testpassword" ) ! Default port 3306 is OK

! Use SQLQUERY for query-type SQL statements
result$ = Control( SqlLun, "SQLQUERY SELECT custname,zipcode,phone FROM custtable WHERE state='CA' AND shipping=0.00" )

result$ = Control( SqlLun, "DISCONNECT" )
Close( SqlLun )

In this case, after the SQLQUERY control is issued, result$ contains a string reporting whether the SQL statement was executed successfully. There are a few different controls used to retrieve the results from an SQL query, where each one presents the results to you in a different way.

Query Results by Read()

After processing an SQL query, the results of the query are stored in a table. This table has a row for each result, and the columns represent the fields you want data for. The most straight forward method of retrieving your results from a query is to use the Read() function. Read() iterates through the result table, filling in the variables defined by the format statement with the corresponding columns of the result row. The variables are index by position only, not by name - this means that you must define your format statement to be in the same order as the columns you selected in the SQLQUERY. Here is how you might get your results using the Read() function:

DataFmt: format name$;zipcode;phonenumber$

!...open and connect...

result$ = Control( SqlLun, "SQLQUERY SELECT custname,zipcode,phone FROM custtable WHERE state='CA' AND shipping=0.00" )

Do
  ! Read in the next row, exception on end of result table
  Read( SqlLun, DataFmt ) Excp=EOF

  ! do something with the results...
  Print "name: "; name$; ", zip code: "; zipcode; ", phone number: "; phonenumber$
loop

EOF:
! No more rows

!...disconnect and close...

When the Read() function has exhausted all of the rows in the result table, it will return an exception for END_OF_FILE, which you can use to determine when you've finished iterating through your results.

Query Results by GETNEXTROW

The GETNEXTROW control processes the result table one row at a time, similar to the Read() function. But instead of populating variables in a format statement, GETNEXTROW returns a single string with all of the fields delimited by a single character. This way you can process each row however you like, and skip the process of copying the fields into variables if you don't need to. The control is used like this:

!...open and connect...

result$ = Control( SqlLun, "SQLQUERY SELECT custname,zipcode,phone FROM custtable WHERE state='CA' AND shipping=0.00" )

Do
  ! Read in the next row, exception on end of result table
  result$ = Control( SqlLun, "GETNEXTROW &" )

  ! Check for end of result table
  if sub( result$, 1, 1 ) EQ "-" GOTO EOF

  ! do something with the results...
  Print result$
loop

EOF:
! No more rows

!...disconnect and close...

The above code gets each row in a single string where the fields are delimited by '&'. Once the rows in the result table have been exhausted, the next GETNEXTROW will return an error string of the form "-ERR". This means that you can simply check for a '-' at the start of the result string to see if you have processed all rows.

Query Results by GETALLROWS

The GETALLROWS control, unlike Read() and GETNEXTROW, returns the entire result set from your query in a single string. This is useful at times when you don't necessarily need to process all of the results (for instance if you are just piping the results to a file), or if you don't need to process each row separately. The GETALLROWS control looks like this:

!...open and connect...

result$ = Control( SqlLun, "SQLQUERY SELECT custname,zipcode,phone FROM custtable WHERE state='CA' AND shipping=0.00" )

! Read all of the results into a single string buffer - row delimiter is carriage return, field delimiter is ','
result$ = Control( SqlLun, "GETALLROWS @0D@ ," )

!...disconnect and close...

The above code copies all of the rows and columns of your result table into a single string. The rows are delimited by a hex 0D (carriage return), and the fields in each row are delimited by a comma ','. You could now print result$ to a file, and have a list of all customers with their zipcodes and phone numbers that satisfy the search criteria. Be aware that there may be buffer size limitations with this control - please see the implementation details and known issues section for more details.

Advanced: Dealing with Very Large Result Sets

You should never need to use the functionality described in this section. For the vast majority of SQL queries and result processing (and i mean virtually all of it), the functions and controls described above should be more than adequate. You could stop reading this guide right now and never run into a problem that this section solves. Now that you understand just how seldom it should be used, you can read on and learn how very large result sets can be processed more efficiently.

First of all, it should be clear that we are talking about very large result sets. When the MySql server returns the result table normally, it is transferred in a single chunk of data (usually stored in a vector or other standard C++ container). This only becomes an issue if you select a large number of rows with large data fields (for instance, a million rows at 1k bytes per row would require ~100GB of local storage). Now, most of the time there are ways to trim down your result table - selecting only the columns you need instead of SELECT *, being more limiting with your WHERE clause, or even specifying a LIMIT in your query. In the rare instance that you cannot pare down your result table any more, we provide an alternate internal storage method.

By configuring the device to use the ROWBYROW result method, instead of transferring the entire result set back in a single chunk, the result table is retrieved on demand, one row at a time. That is, once you have submitted a query, each time a Read() or GETNEXTROW is performed, the SQL device goes out to the MySql server and retrieves the next row:

Open( SqlLun ) "SQL"
result$ = Control( SqlLun, "CONNECT sampledb localhost testuser testpassword" ) ! Default port 3306 is OK

! Set the device to get each row on demand as we request them
result$ = Control( SqlLun, "SETRESULTMETHOD ROWBYROW" )

! Use SQLQUERY for query-type SQL statements
result$ = Control( SqlLun, "SQLQUERY SELECT custname,zipcode,phone FROM custtable WHERE state='CA' AND shipping=0.00" )

Do
  ! Read in the next row, exception on end of result table
  Read( SqlLun, DataFmt ) Excp=EOF

  ! do something with the results...
  Print "name: "; name$; ", zip code: "; zipcode; ", phone number: "; phonenumber$
loop

EOF:
! No more rows

result$ = Control( SqlLun, "DISCONNECT" )
Close( SqlLun )

So each time the Read() function is executed, the device connects to the MySql server and retrieves the next row. The benefit of this is that you don't have to store all of the results on the local workstation, but at the cost of having to connect to the database and retrieve the next row each time you Read(). Due to the nature of the ROWBYROW result method, you cannot perform the GETALLROWS control (if you wanted to, why are you using the ROWBYROW method in the first place?). Also, due to limitations in the connectivity of the MySql server, you must process all of the rows in a result table before you perform another query. That is, you must Read() or GETNEXTROW until there are no more rows left. If you try to perform another query first, you will receive an error.

Lastly, in order to set the SQL device back to its original result method, perform the SETRESULTMETHOD control with the argument ALLATONCE to specify that you would like to get all of the result table back at once from the server. The SQL device defaults to this method, so a newly opened SQL device will initialize using this method.

Reference

The Sql control functions are broken into three categories: Management controls deal with the database connection and configuration, Non-Query Sql controls deal with non-query Sql statements, and Query Sql controls deal with all query-type Sql statements. The Non-Query Sql controls are designed to deal with executing all Sql statements that don't perform queries or return result sets. Sql commands like INSERT, CREATE, and UPDATE all perform some action in Sql, but return a simple result like the number of rows affected. Conversely, the Query Sql controls are designed to deal with executing all the Sql statements that perform queries and return result sets. Basically SELECTS are issued using this category, and controls are provided to get rows and results out of the returned table.

The following sections show the syntax for all of the Sql Control strings:

Management Controls

These allow the programmer to control the database connection as well as any configuration options available for the Sql device.

Connect <database> <server> <user> <password> <port>
- Connects to the database on the server with the parameters specified. If your database is on the local machine, just use localhost. The final three parameters are optional (port defaults to the normal Sql port - 3306, and the user/password default to blank). Although hopefully you don't have any users with blank passwords!
Disconnect
- Disconnects from the database and server.
SetResultMethod <ALLATONCE/ROWBYROW>
- Specifies whether the queries issued should return results using the ALLATONCE or ROWBYROW methods - more details above.
GetLastError
- Returns a string containing details on the last mysql++ error.

Non-Query Sql Controls

These functions are used to issue and report results for Sql commands that are not queries (i.e. INSERT, CREATE, UPDATE, etc.).

SqlExecute <sql_string>
- Issues a non-query Sql command to the database and server. SELECT should not be used with this control.
GetRowsAffected
- Returns the number of rows that were affected by the last SqlExecute control.

Query Sql Controls

These functions are used to issue Sql queries and retrieve the resulting tables (i.e. SELECT statements). NOTE: result strings from Control statements are restricted to 8192 bytes! If your row or result table is longer than this, you will get a buffer overflow error.

SqlQuery <sql_string>
- Issues a query Sql command to the database and server. You should only use Sql commands that return result tables with this control.
GetNextRow <delimiter>
- Returns the next row as a string delimited by the parameter (single character only).
GetAllRows <row delimiter> <field delimiter>
- (Can only be used on ALLATONCE queries!) Returns all of the result rows as a string delimited by the row and field delimiters (single characters only).

Future Controls?

Some day I think that it would be nice to abstract out the Sql commands by providing functions like these:

QueryOn( "customer_id", "customer_name" )
LimitOn( "customer_state", "NY" )
LimitOn( "customer_since", "1999-12-31" )
SortBy( "customer_id" )

resultSet = SendStatement()

where we can build the Sql queries based on the tables and constraints provided.

Though once we start moving into JOINs and the like, this kind of interface seems hard to implement. My impression is that anybody doing any non-trivial SQL work inside IB will probably have to have an understanding of SQL syntax and implementation.

Known Issues

Maximum Result Buffer Size (8192)

Right now the maximum length of a result string returned from the Control() function is 8192 (note this is for all Control() calls, not specific to SQL). This means that if you perform a GETALLROWS or GETNEXTROW and the result string is longer than 8192, you will get a buffer overflow exception. We are working to resolve this, so in the meantime, please tailor your retrieval methods accordingly. Either place an appropriate LIMIT on your queries, or stick to the Read() function.

Binary Data (blobs)

Binary data is inherently tricky to work with in string format because some characters (single/double quotes, backslashes, and NULL) are considered special. In order to insert these into mysql using the SQL driver, you will need to "escape" these characters by prepending backslashes (and for NULL replacing with '\0'). If you do not escape these characters, and they are contained within your field data, the mysql driver will interpret the wrong data and your results will be undefined at best (possibly crash the system).

We recommend using base64 and base256 to encode your binary data when inserting into mysql. Please keep in mind the above issue that buffer sizes are still limited to 8192 bytes.

SQLEXECUTE vs. SQLQUERY Mixup

An obvious question you might ask is, "what happens when you perform a query with SQLEXECUTE or a non-query (for instance INSERT) with SQLQUERY?" "Undefined" would be the general answer, but here is a little more detail for the most likely scenarios. Because we do not check your SQL syntax (and the underlying library only does enough to make sure it is processing a valid statement, not the correct kind of statement), your statement will be processed by the MySql database.

For SQLQUERY, this means that if you perform a data modification statement, the data WILL be modified! Any controls used to handle results will not work, however, because it will be checking an invalid result type. Therefore you cannot rely on your program to know whether the control was successful.

For SQLEXECUTE, this means that if you perform a query, the query WILL be processed (thus theoretically performing any stored procedures, triggers, etc) on the server. The result table, however, will not be returned to the local workstation, effectively locking the library because you cannot perform another SQL operation until this one has been completed. You will have to disconnect and reconnect to the server in order to perform any more SQL operations.

Unfortunately there is no easy way to solve this issue without hard coding checks for all SQL syntax to ensure you are using the correct control. Instead we leave it up to you to know which control to use, and to use them correctly. If you run across any other undefined behavior, please post it here or on the discussion page.

Implementation Details

This section is only for those of you curious about how the SQL device is implemented. For those of you that are more familiar with MySql, please read on and follow the links to get a better idea of how the device works. If you see some extended functionality that we can implement, please let us know. Just remember that you do not need to understand anything in this section in order to be able to use the SQL device, it is solely to satisfy any curiosity you might have.

The SQL driver utilizes the MySQL++ library to interface with the MySQL server. The library is compiled and linked into our own library, which provides the interface to the IB applications. Currently the MySQL++ library as well as our library is compiled and linked dynamically (.dll). This means that both dlls are required to run comet with Sql support (which are provided behind-the-scenes starting with the v407 workstation install).

The MySQL++ library and documentation can be found at the MySQL++ Website