The following article has been published on the CodeGuru site.
(last update: 11 Nov 2001)

Environment: Windows 2000/SP2, VC++ 6.0/SP3
The world of database programming includes names like Oracle, Informix, etc. If you are an MFC programmer, you are probably familiar with
terms like ODBC, DAO, Access, etc. But what if you need a simple system to handle your data quickly? Berkeley DB is a library you can
use to handle your data without resorting to the well-known names of database programming, and this article shows you how to use it in your
Win32/MFC projects through a C++ wrapper.
The sample application presented here requires a minimal knowledge of database programming and some basic notions of MFC.
The AccessLog sample presented here is a utility that uses the Berkeley DB library. It is only a sample, so do not expect a general database system with SQL support, sophisticated data conversion, triggers, etc. The only purpose of this article is to show you how to use a very reliable database system under Win32 through a C++ wrapper. The C++ classes presented here to wrap the Berkeley DB library are only a convenience for using it in an XBase manner, and do not claim to be the final solution for database programming.
Most Internet hosts are configured to track their activity in a log file. The same applies to web (HTTP) servers, which
generate a log file containing all the requests coming from the users (the access log file).
This kind of log contains all the basic information about the user's request, like the IP address, the name of the requested object, its size,
the return code, etc.
The following line shows a typical entry of the access log file, where the real IP address has been replaced with the "nnn.nnn.nnn.nnn"
sequence:
nnn.nnn.nnn.nnn - - [03/Oct/2001:10:18:21 -0400] "GET / HTTP/1.0" 200 27870 "-" "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98;)"
As you can see, each entry (record) of the access log contains the following fields:
| value | description |
| nnn.nnn.nnn.nnn | The ip address of the user who makes the request |
| [03/Oct/2001:10:18:21 -0400] | The date and time when the server receives the request. The GMT offset for the time (-0400) is relative to the time zone of the server. |
| GET / HTTP/1.0 | The request received by the server, usually a GET command. This field contains the HTTP command (GET, HEAD, POST, etc.), the requested object (the url location) and the version of HTTP protocol supported by the client. |
| 200 | The return code (200=Ok, 404=Not found, etc.). |
| 27870 | The size of the requested object (in bytes). |
| - | The "Referer" url, which indicates where the user comes from. |
| Mozilla/4.0 (compatible; MSIE 5.0; Windows 98;) | The "UserAgent" field, which identifies the client used by the user. |
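The field layout above can be sketched as a small parser. This is only an illustration in standard C++, not code from the AccessLog project: the `LogEntry` type and the `ParseLogLine()` and `NextDelimited()` functions are hypothetical names, and the sketch assumes that every entry has exactly the layout of the sample line (bracketed date, quoted request, referer and user agent).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical record type; the members mirror the fields of the table above.
struct LogEntry {
    std::string ip;
    std::string date;       // stored without the surrounding brackets
    std::string request;    // stored without the surrounding quotes
    int code = 0;
    long size = 0;
    std::string referer;
    std::string userAgent;
};

// Extracts the text between the next `open` character found at or after
// `pos` and the following `close` character, then moves `pos` past `close`.
// Returns false if one of the delimiters is missing.
static bool NextDelimited(const std::string& line, std::size_t& pos,
                          char open, char close, std::string& out)
{
    assert(pos <= line.size());
    std::size_t b = line.find(open, pos);
    if (b == std::string::npos) return false;
    std::size_t e = line.find(close, b + 1);
    if (e == std::string::npos) return false;
    out = line.substr(b + 1, e - b - 1);
    pos = e + 1;
    return true;
}

// Parses one access log entry. Returns false if the line is malformed.
bool ParseLogLine(const std::string& line, LogEntry& entry)
{
    // The ip address is everything up to the first space.
    std::size_t pos = line.find(' ');
    if (pos == std::string::npos) return false;
    entry.ip = line.substr(0, pos);

    if (!NextDelimited(line, pos, '[', ']', entry.date)) return false;
    if (!NextDelimited(line, pos, '"', '"', entry.request)) return false;

    // The return code and the size follow the request, separated by spaces.
    const char* p = line.c_str() + pos;
    char* end = nullptr;
    entry.code = static_cast<int>(std::strtol(p, &end, 10));
    if (end == p) return false;          // no return code found
    p = end;
    entry.size = std::strtol(p, &end, 10);   // stays 0 if the size is not numeric
    pos = static_cast<std::size_t>(end - line.c_str());

    if (!NextDelimited(line, pos, '"', '"', entry.referer)) return false;
    if (!NextDelimited(line, pos, '"', '"', entry.userAgent)) return false;
    return true;
}
```

A caller would read the log file line by line, pass each line to `ParseLogLine()` and store the resulting fields in the table.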
The main purpose of the AccessLog sample app is to illustrate how to use the Berkeley DB library in everyday practice. Do not expect a
sophisticated C++ wrapper that transforms the Berkeley DB library into a general SQL database. The C++ wrapper presented here is only
sample code that gives an XBase flavor to Berkeley DB. If your technical background includes languages like the legendary Clipper (from
Nantucket), you will surely enjoy this wrapper. On the other hand, if you are looking for some SQL, you must resort to tools like the
well-known MentorSQL experiment (but do not ask me, because the source code is no longer available and I do not know where to find it).
The Berkeley DB library uses a hierarchical approach, so to use it in an XBase (relational) manner we need to make some adjustments. Our C++
wrapper (let me call it the CBase interface) allows you to define an index for each field of your table. The mechanism used by the
CBase wrapper is very simple. Each table you define through the CBase interface contains a 'hidden' field used as the primary key. This
field contains the unique key that identifies the record. Each index you define for the main table contains the values of the secondary
and primary keys.
Suppose that Table1 is your main table:
| primary key | your fields (name, email, age, address, etc.) here |
| 0000000001 | Luca, lpiergentili@yahoo.com, etc. |
| 0000000002 | Pasqualino, pasqualino@settebellezze.com, etc. |
| 0000000003 | Giacomino, giacomino@bellammare.com, etc. |
| etc. |

An index on the name field pairs each name (the secondary key) with the primary key of its record:
| name field (secondary key) | primary key from the table Table1 |
| Luca | 0000000001 |
| Pasqualino | 0000000002 |
| Giacomino | 0000000003 |
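| etc. |

In the same way, an index on the email field pairs each address with the primary key of its record:
| email field (secondary key) | primary key from the table Table1 |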
| lpiergentili@yahoo.com | 0000000001 |
| pasqualino@settebellezze.com | 0000000002 |
| giacomino@bellammare.com | 0000000003 |
| etc. |
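The mechanism shown in the tables above can be modeled in a few lines of standard C++. This is a minimal in-memory sketch and does not use the Berkeley DB API: the `IndexedTable` class and its methods are hypothetical names, the ten-digit key format is taken from the example tables, and how the CBase wrapper really generates its hidden key is an assumption.

```cpp
#include <cassert>
#include <cstdio>
#include <map>
#include <string>
#include <vector>

// Hypothetical in-memory model of a main table with two secondary indexes.
class IndexedTable {
public:
    // Stores a record and returns the generated (hidden) primary key.
    std::string Insert(const std::string& name, const std::string& email)
    {
        char buf[32];
        std::snprintf(buf, sizeof(buf), "%010lu", ++m_counter);
        std::string key = buf;
        m_records[key] = Record{name, email};
        m_byName.emplace(name, key);    // secondary key -> primary key
        m_byEmail.emplace(email, key);  // secondary key -> primary key
        return key;
    }

    // Two-step lookup: the name index yields the primary keys,
    // the primary keys yield the records in the main table.
    std::vector<std::string> FindEmailsByName(const std::string& name) const
    {
        std::vector<std::string> result;
        auto range = m_byName.equal_range(name);
        for (auto it = range.first; it != range.second; ++it) {
            auto rec = m_records.find(it->second);
            assert(rec != m_records.end());  // each index entry points at a stored record
            result.push_back(rec->second.email);
        }
        return result;
    }

private:
    struct Record { std::string name; std::string email; };

    unsigned long m_counter = 0;
    std::map<std::string, Record> m_records;            // main table
    std::multimap<std::string, std::string> m_byName;   // index on the name field
    std::multimap<std::string, std::string> m_byEmail;  // index on the email field
};
```

The indexes are multimaps because two records may share the same secondary key, while the primary key is always unique.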
The CBerkeleyDB class defines the following structures, for the field:
struct ROW {
    int num;             // field position (0 based)
    int ofs;             // field offset (0 based)
    char* name;          // field name
    char type;           // field type
    int size;            // field size
    int dec;             // decimals
    char* value;         // pointer to the field content
    unsigned long flags; // filters
};
for the table:
struct TABLE {
    char filename[_MAX_PATH+1]; // table name
    int totfield;               // number of fields
    ROW* row;                   // array for field definition, must alloc at run-time
    int totindex;               // number of indexes
    INDEX* index;               // array for index definition, must alloc at run-time
    TABLE_STAT stat;            // current state
};
struct DATABASE {
    TABLE table; // table definition
};
and for the index:
struct INDEX {
    char filename[_MAX_PATH+1]; // index filename
    char* name;                 // index name
    char* fieldname;            // the name of the (table) field used as the key
    int fieldnum;               // the number of the (table) field used as the key (0 based)
};
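One way to read the `ofs` and `size` members of the field structure is that each field occupies `size` bytes at offset `ofs` inside a fixed-length record, in the XBase tradition. The following sketch illustrates that idea only. It assumes a space-padded layout, and `FieldDef`, `LayoutFields()`, `PutField()` and `GetField()` are hypothetical names; the wrapper's actual record format may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified field descriptor (a subset of the ROW members).
struct FieldDef {
    const char* name;
    int size;   // field width in bytes
    int ofs;    // offset inside the record, filled in by LayoutFields()
};

// Assigns consecutive offsets to the fields and returns the record length.
int LayoutFields(std::vector<FieldDef>& fields)
{
    int ofs = 0;
    for (FieldDef& f : fields) {
        f.ofs = ofs;
        ofs += f.size;
    }
    return ofs;
}

// Copies `value` into the slot of field `f`, truncating it or padding it
// with spaces so that it fills exactly `f.size` bytes.
void PutField(std::string& record, const FieldDef& f, const std::string& value)
{
    assert(record.size() >= static_cast<std::size_t>(f.ofs + f.size));
    std::string padded = value.substr(0, f.size);
    padded.resize(f.size, ' ');
    record.replace(f.ofs, f.size, padded);
}

// Returns the content of field `f` with the trailing padding removed.
std::string GetField(const std::string& record, const FieldDef& f)
{
    std::string value = record.substr(f.ofs, f.size);
    std::size_t last = value.find_last_not_of(' ');
    return last == std::string::npos ? std::string() : value.substr(0, last + 1);
}
```

With a layout like this, the record length is simply the sum of the field sizes, and every field can be read or written without parsing the rest of the record.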
The CBerkeleyDB class also contains all the basic methods required to access the table, like Open(), Create(), Close(), Insert(), Delete(),
etc. To retrieve a record, the CBerkeleyDB class defines the GetFirst...Last() methods, while all the (internal) primary key handling is done
through the GetPrimary...() functions. The CBerkeleyDB class does not contain pure virtual methods, like SomeMethod() = 0, but you should think of
it as a base class.
To define the table for the access log, we first include the headers of the wrapper:
#include "CBase.h"
#include "CTable.h"
Next, we need to define the table name (see the LOG_TABLE macro) and the length of each field of the record:
#define LOG_TABLE "logtable"
#define LOG_IP_LEN MAX_URL
#define LOG_DATE_LEN 32
#define LOG_GET_LEN MAX_URL
#define LOG_CODE_LEN 5
#define LOG_SIZE_LEN 10
#define LOG_REFERER_LEN MAX_URL
#define LOG_USERAGENT_LEN MAX_URL
#define LOG_RECORD_LENGTH (LOG_IP_LEN + LOG_DATE_LEN + LOG_GET_LEN + LOG_CODE_LEN + LOG_SIZE_LEN + LOG_REFERER_LEN + LOG_USERAGENT_LEN)
Also, we need to define the id for all the indexes:
#define LOG_IDX_IP 0
#define LOG_IDX_GET 1
#define LOG_IDX_REFERER 2
#define LOG_IDX_USERAGENT 3
#define LOG_IDX_DATE 4
Now we can start with the CLogTable class definition:
class CLogTable : public CTable
{
private:
    // record definition
    struct RECORD {
        char ip[LOG_IP_LEN+1];
        char date[LOG_DATE_LEN+1];
        char get[LOG_GET_LEN+1];
        int code;
        long size;
        char referer[LOG_REFERER_LEN+1];
        char useragent[LOG_USERAGENT_LEN+1];
    };
    CBASE_TABLE* table_struct; // pointer to the table struct definition, see CLogDatabase.cpp
    CBASE_INDEX* idx_struct; // pointer to the index struct definition, see CLogDatabase.cpp
    RECORD record; // the instance of our record
    char m_szRecord[LOG_RECORD_LENGTH+1]; // internally used to store the entire record
    char m_szTableName[_MAX_PATH+1]; // the name of the table
    char m_szTablePath[_MAX_PATH+_MAX_FNAME+1]; // the pathname for the table
    char m_szIndexIp[_MAX_PATH+_MAX_FNAME+1]; // the pathname for the index referring to the ip field
    char m_szIndexGet[_MAX_PATH+_MAX_FNAME+1]; // the pathname for the index referring to the get field
    char m_szIndexReferer[_MAX_PATH+_MAX_FNAME+1]; // the pathname for the index referring to the referer field
    char m_szIndexUserAgent[_MAX_PATH+_MAX_FNAME+1]; // the pathname for the index referring to the user agent field
    char m_szIndexDate[_MAX_PATH+_MAX_FNAME+1]; // the pathname for the index referring to the date field
public:
    // ctor/dtor
    CLogTable(LPCSTR lpcszTableName = NULL,LPCSTR lpcszDataPath = NULL,BOOL bOpenTable = FALSE);
    virtual ~CLogTable();
    // must define the pure virtual members of the CTable class
    inline const char* GetClassName(void) {return("CLogTable");}
    inline const char* GetStaticTableName(void) {return(LOG_TABLE);}
    inline const char* GetTableName(void) {return(m_szTableName);}
    inline const char* GetTablePathName(void) {return(m_szTablePath);}
    inline const CBASE_TABLE* GetTableStruct(void) {return(table_struct);}
    inline const CBASE_INDEX* GetIndexStruct(void) {return(idx_struct);}
    inline const int GetRecordLength(void) {return(LOG_RECORD_LENGTH);}
    const char* GetRecordAsString(void);
    inline void ResetMemvars(void) {memset(&record,'\0',sizeof(record));}
    void GatherMemvars(void);
    void ScatterMemvars(BOOL = TRUE);
    // methods to retrieve the field's content
    inline const char* GetField_Ip(void) {return(record.ip);}
    inline const char* GetField_Date(void) {return(record.date);}
    inline const char* GetField_Get(void) {return(record.get);}
    inline int GetField_Code(void) {return(record.code);}
    inline long GetField_Size(void) {return(record.size);}
    inline const char* GetField_Referer(void) {return(record.referer);}
    inline const char* GetField_UserAgent(void) {return(record.useragent);}
    // methods to fill the field
    inline void PutField_Ip(const char* value) {strcpyn(record.ip,value,sizeof(record.ip));}
    inline void PutField_Date(const char* value) {strcpyn(record.date,value,sizeof(record.date));}
    inline void PutField_Get(const char* value) {strcpyn(record.get,value,sizeof(record.get));}
    inline void PutField_Code(int value) {record.code = value;}
    inline void PutField_Size(long value) {record.size = value;}
    inline void PutField_Referer(const char* value) {strcpyn(record.referer,value,sizeof(record.referer));}
    inline void PutField_UserAgent(const char* value) {strcpyn(record.useragent,value,sizeof(record.useragent));}
};
The CLogTable class inherits all the methods to handle our data from the CTable class (which contains the CBase object), and if you look
into the CLogDatabase.cpp file, you can see that the CLogTable class only defines the structure for the table:
static CBASE_TABLE table[] = {
    {"IP", 'C', LOG_IP_LEN, 0},
    {"DATE", 'C', LOG_DATE_LEN, 0},
    {"GET", 'C', LOG_GET_LEN, 0},
    {"CODE", 'N', LOG_CODE_LEN, 0},
    {"SIZE", 'N', LOG_SIZE_LEN, 0},
    {"REFERER", 'C', LOG_REFERER_LEN, 0},
    {"USERAGENT", 'C', LOG_USERAGENT_LEN, 0},
    {NULL, 0, 0, 0}
};
for the index:
static CBASE_INDEX idx[] = {
    {NULL, "IDX_IP", "IP"},
    {NULL, "IDX_GET", "GET"},
    {NULL, "IDX_REFERER", "REFERER"},
    {NULL, "IDX_USERAGENT", "USERAGENT"},
    {NULL, "IDX_DATE", "DATE"},
    {NULL, NULL, NULL}
};
and implements the required methods, like GatherMemvars(), ScatterMemvars() and GetRecordAsString().
The application then reaches the table through a CLogDatabaseService object:
CLogDatabaseService LogService(szTableName,szTablePath,TRUE); // defines an object for complex operations on the table
CLogTable* pLogTable = LogService.GetTable(); // obtains the pointer to the CLogTable object to use it directly
I know that the CBase C++ wrapper cannot be fully and clearly explained in a short article, so if you like the wrapper and are thinking about using it in your apps, you need to explore all the source code. Comments are not in English, but I think the code is clear enough, and the use of the Hungarian and MFC notations will help you.
The AccessLog sample app requires the Berkeley DB library, which is not included with the source code of the project due to its size (over
1 MB). If you want to use the AccessLog app or the CBase C++ wrapper, you must first download the Berkeley DB library version 2.7.7 from the
Sleepycat site and install it. Note that I wrote the C++ wrapper back in 1998, which
is why the CBase interface uses Berkeley DB version 2.7.7 (the version available at that time). The CBase interface
presented here does not work with the latest version of the Berkeley DB library (3.3.11), due to the changes introduced since version 2.7.7.
The AccessLog project assumes that you install the Berkeley DB library into the \BerkeleyDB directory. Also, the CBerkeleyDB.h header file
contains a #pragma directive to automatically link the import library of the required DLL (BerkeleyDB.dll):
#ifdef _DEBUG
#pragma comment(lib,"BerkeleyDB.d.lib")
#else
#pragma comment(lib,"BerkeleyDB.lib")
#endif
so you must modify the original Berkeley DB project file and specify the BerkeleyDB.dll name for the output DLL file (BerkeleyDB.d.dll for the debug version).
As usual, ideas, suggestions and enhancements are always welcome.
Luca Piergentili
lpiergentili@yahoo.com
http://www.geocities.com/lpiergentili/