SIGMOD Contest 2013
core.h File Reference


Macros

#define MAX_DOC_LENGTH   (1<<22)
 Maximum document length in characters.
 
#define MAX_WORD_LENGTH   31
 Maximum word length in characters.
 
#define MIN_WORD_LENGTH   4
 Minimum word length in characters.
 
#define MAX_QUERY_WORDS   5
 Maximum number of words in a query.
 
#define MAX_QUERY_LENGTH   ((MAX_WORD_LENGTH+1)*MAX_QUERY_WORDS)
 Maximum query length in characters.
 

Typedefs

typedef unsigned int QueryID
 Query ID type.
 
typedef unsigned int DocID
 Document ID type.
 

Enumerations

enum  MatchType { MT_EXACT_MATCH , MT_HAMMING_DIST , MT_EDIT_DIST }
 Matching types.
 
enum  ErrorCode { EC_SUCCESS , EC_NO_AVAIL_RES , EC_FAIL }
 Error codes.
 

Functions

ErrorCode InitializeIndex ()
 Called only once at the beginning of the whole test.
 
ErrorCode DestroyIndex ()
 Called only once at the end of the whole test.
 
ErrorCode StartQuery (QueryID query_id, const char *query_str, MatchType match_type, unsigned int match_dist)
 Add a query (associated with matching type) to the active query set.
 
ErrorCode EndQuery (QueryID query_id)
 Remove a query from the active query set.
 
ErrorCode MatchDocument (DocID doc_id, const char *doc_str)
 Push a document to the server.
 
ErrorCode GetNextAvailRes (DocID *p_doc_id, unsigned int *p_num_res, QueryID **p_query_ids)
 Return the set of active queries that match one previously submitted document whose results have not yet been returned, sorted by query ID.
 

Macro Definition Documentation

◆ MAX_DOC_LENGTH

#define MAX_DOC_LENGTH   (1<<22)

Maximum document length in characters.

◆ MAX_QUERY_LENGTH

#define MAX_QUERY_LENGTH   ((MAX_WORD_LENGTH+1)*MAX_QUERY_WORDS)

Maximum query length in characters.
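
The macro evaluates to 160. Assuming words are separated by single spaces, the longest possible query has 5 words of 31 letters plus 4 spaces, i.e. 159 characters, which leaves one byte for the terminating null. The following is a minimal sketch of that arithmetic, not part of core.h; the helper name CopyQuery is hypothetical and the macros are repeated only so the block compiles on its own.

```cpp
#include <cassert>
#include <cstring>

// Repeated from core.h so that this sketch is self-contained.
#define MAX_WORD_LENGTH 31
#define MAX_QUERY_WORDS 5
#define MAX_QUERY_LENGTH ((MAX_WORD_LENGTH+1)*MAX_QUERY_WORDS)

// 5 words * 31 letters + 4 single-space separators + 1 null byte == 160.
static_assert(MAX_QUERY_WORDS * MAX_WORD_LENGTH + (MAX_QUERY_WORDS - 1) + 1
                  == MAX_QUERY_LENGTH,
              "a maximal single-spaced query fits, including its terminator");

// Hypothetical helper: copy a query string into a fixed-size buffer.
// Assumes query_str obeys the limits documented for StartQuery().
void CopyQuery(const char* query_str, char (&buf)[MAX_QUERY_LENGTH]) {
    std::size_t n = std::strlen(query_str);
    assert(n < static_cast<std::size_t>(MAX_QUERY_LENGTH));
    std::memcpy(buf, query_str, n + 1);  // n + 1 also copies the null byte
}
```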

◆ MAX_QUERY_WORDS

#define MAX_QUERY_WORDS   5

Maximum number of words in a query.

◆ MAX_WORD_LENGTH

#define MAX_WORD_LENGTH   31

Maximum word length in characters.

◆ MIN_WORD_LENGTH

#define MIN_WORD_LENGTH   4

Minimum word length in characters.

Typedef Documentation

◆ DocID

typedef unsigned int DocID

Document ID type.

◆ QueryID

typedef unsigned int QueryID

Query ID type.

Enumeration Type Documentation

◆ ErrorCode

enum ErrorCode

Error codes:

Enumerator
EC_SUCCESS 

Must be returned by each core function unless specified otherwise.

EC_NO_AVAIL_RES 

Must be returned only if there is no available result to be returned by GetNextAvailRes().

That is, all results have already been returned via previous calls to GetNextAvailRes().

EC_FAIL 

Used only for debugging purposes, and must not be returned in the final submission.

◆ MatchType

enum MatchType

Matching types:

Enumerator
MT_EXACT_MATCH 

Two words match if they are exactly the same.

MT_HAMMING_DIST 

Two words match if they have the same number of characters, and the number of mismatching characters in the same position is not more than a specific threshold.

MT_EDIT_DIST 

Two words match if one of them can be transformed into the other word by inserting, deleting, and/or replacing a number of characters.

The number of such operations must not exceed a specific threshold.
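
The two distance-based match types can be sketched as follows. This is an illustrative sketch, not the contest's reference implementation: the function names HammingDistance and EditDistance and the constant kMaxWordLength (mirroring MAX_WORD_LENGTH) are assumptions, and the edit distance is taken to be the standard Levenshtein distance, which fits the insert/delete/replace description above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>

// Mirrors MAX_WORD_LENGTH from core.h so the sketch compiles on its own.
constexpr unsigned kMaxWordLength = 31;

// Number of positions at which two words differ. Words of different length
// never match, so return a value larger than any allowed threshold.
unsigned HammingDistance(const char* a, const char* b) {
    std::size_t la = std::strlen(a), lb = std::strlen(b);
    if (la != lb) return kMaxWordLength + 1;
    unsigned d = 0;
    for (std::size_t i = 0; i < la; ++i) d += (a[i] != b[i]) ? 1u : 0u;
    return d;
}

// Levenshtein distance using two rows of the dynamic-programming table.
unsigned EditDistance(const char* a, const char* b) {
    std::size_t la = std::strlen(a), lb = std::strlen(b);
    assert(la <= kMaxWordLength && lb <= kMaxWordLength);
    unsigned prev[kMaxWordLength + 1], cur[kMaxWordLength + 1];
    for (std::size_t j = 0; j <= lb; ++j) prev[j] = static_cast<unsigned>(j);
    for (std::size_t i = 1; i <= la; ++i) {
        cur[0] = static_cast<unsigned>(i);
        for (std::size_t j = 1; j <= lb; ++j) {
            unsigned replace_cost = prev[j - 1] + ((a[i - 1] != b[j - 1]) ? 1u : 0u);
            unsigned delete_cost = prev[j] + 1;
            unsigned insert_cost = cur[j - 1] + 1;
            cur[j] = std::min({replace_cost, delete_cost, insert_cost});
        }
        std::copy(cur, cur + lb + 1, prev);
    }
    return prev[lb];
}
```

Under these definitions, a word pair matches when the computed distance does not exceed the threshold supplied with the query.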

Function Documentation

◆ DestroyIndex()

ErrorCode DestroyIndex ( )

Called only once at the end of the whole test.

Can be used for releasing all memory used to index active queries. The time spent in this function will not affect the score of the submission.

◆ EndQuery()

ErrorCode EndQuery ( QueryID query_id)

Remove a query from the active query set.

Parameters
[in]  query_id  The integral ID of the query. This function will not be called twice with the same query ID.
Returns
ErrorCode
  • EC_SUCCESS if the query was unregistered successfully

◆ GetNextAvailRes()

ErrorCode GetNextAvailRes ( DocID * p_doc_id,
unsigned int * p_num_res,
QueryID ** p_query_ids )

Return the set of active queries that match one previously submitted document whose results have not yet been returned, sorted by query ID.

The result for a document must reflect the set of queries that were active at the time MatchDocument() was called for that document.

Parameters
[out]  *p_doc_id  A document ID that has not been returned before. The library may choose any document whose results have not yet been returned.
[out]  *p_num_res  The number of active queries that matched the document *p_doc_id.
[out]  *p_query_ids  An array of the IDs of the *p_num_res matching queries, ordered by ID value. The core library must allocate this array using malloc(); allocating it using "new" is not acceptable. The core library must not free the array, since the testing benchmark frees it. If *p_num_res is 0, the array must not be allocated, as the testing benchmark will not free it in that case. If *p_num_res is not zero, the size of the array must be equal to "(*p_num_res)*sizeof(QueryID)" bytes.
Returns
ErrorCode
  • EC_NO_AVAIL_RES if all documents have already been returned by previous calls to this function
  • EC_SUCCESS if results were returned successfully
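
The allocation rules for *p_query_ids can be sketched as below. MakeResultArray is a hypothetical helper, not part of core.h, and the use of std::vector for the intermediate result is an assumption about how an implementation might store matches. The reference does not say how an allocation failure should be reported, so the sketch simply aborts.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <vector>

typedef unsigned int QueryID;  // mirrors the typedef in core.h

// Hypothetical helper: turn a set of matching query IDs into the output
// expected by GetNextAvailRes(). The caller (the benchmark) owns and frees
// the returned array, so it must come from malloc(), never from "new".
QueryID* MakeResultArray(std::vector<QueryID> ids, unsigned int* p_num_res) {
    assert(p_num_res != nullptr);
    std::sort(ids.begin(), ids.end());  // results must be ordered by ID
    *p_num_res = static_cast<unsigned int>(ids.size());
    if (ids.empty()) return nullptr;    // zero results: allocate nothing
    std::size_t bytes = ids.size() * sizeof(QueryID);
    QueryID* out = static_cast<QueryID*>(std::malloc(bytes));
    if (out == nullptr) std::abort();   // sketch choice, see lead-in
    std::memcpy(out, ids.data(), bytes);
    return out;
}
```

An implementation of GetNextAvailRes() could assign the returned pointer to *p_query_ids after setting *p_doc_id.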

◆ InitializeIndex()

ErrorCode InitializeIndex ( )

Called only once at the beginning of the whole test.

Performs any required initializations.

◆ MatchDocument()

ErrorCode MatchDocument ( DocID doc_id,
const char * doc_str )

Push a document to the server.

Parameters
[in]  doc_id  The integral ID of the document. This function will not be called twice with the same document ID.
[in]  doc_str  A null-terminated string representing the document. It consists of a space-separated set of words. The length of any word will be at least MIN_WORD_LENGTH characters and at most MAX_WORD_LENGTH characters. The length of any document will not exceed MAX_DOC_LENGTH characters. "doc_str" contains at least one non-space character. "doc_str" contains only lower case letters from 'a' to 'z' and space characters.
Returns
ErrorCode
  • EC_SUCCESS if the document was added successfully
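
Given the format of "doc_str" described above, splitting a document into words might look like the following sketch. SplitWords is a hypothetical helper, not part of core.h. The reference does not state whether consecutive or leading spaces can occur, so the sketch tolerates them.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: split a null-terminated, space-separated string into
// its words. Runs of spaces and leading/trailing spaces produce no empty words.
std::vector<std::string> SplitWords(const char* str) {
    assert(str != nullptr);
    std::vector<std::string> words;
    const char* p = str;
    while (*p != '\0') {
        while (*p == ' ') ++p;                     // skip separators
        const char* start = p;
        while (*p != '\0' && *p != ' ') ++p;       // scan one word
        if (p > start) words.emplace_back(start, p);
    }
    return words;
}
```

The same helper applies to "query_str" in StartQuery(), which uses the same word format.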

◆ StartQuery()

ErrorCode StartQuery ( QueryID query_id,
const char * query_str,
MatchType match_type,
unsigned int match_dist )

Add a query (associated with matching type) to the active query set.

Parameters
[in]  query_id  The integral ID of the query. This function will not be called twice with the same query ID.
[in]  query_str  A null-terminated string representing the query. It consists of a space-separated set of words. The length of any word will be at least MIN_WORD_LENGTH characters and at most MAX_WORD_LENGTH characters. The number of words in a query will not exceed MAX_QUERY_WORDS words. "query_str" contains at least one non-space character. "query_str" contains only lower case letters from 'a' to 'z' and space characters.
[in]  match_type  The type of mechanism used to consider a query as a match to any document, as specified in the MatchType enumeration.
[in]  match_dist  The Hamming or edit distance threshold (according to "match_type"), used as explained in the MatchType enumeration. This parameter must be equal to 0 for exact matching. The possible values of this parameter are 0, 1, 2, and 3. A query matches a document if and only if, for each word in the query, there exists a word in the document that matches it under the "match_type" and "match_dist" constraints. Note that the "match_dist" constraint is applied independently to each word in the query.
Returns
ErrorCode
  • EC_SUCCESS if the query was registered successfully
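
The matching rule stated for "match_dist" (every query word must be matched by at least one document word) can be sketched as a brute-force check. QueryMatchesDocument and the word_match predicate are hypothetical names, not part of core.h; the predicate stands in for whichever per-word test "match_type" and "match_dist" select. A competitive implementation would likely avoid this quadratic scan; the sketch only illustrates the semantics.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: true if and only if every word of the query is matched
// by at least one word of the document under the given per-word predicate.
template <typename WordMatch>
bool QueryMatchesDocument(const std::vector<std::string>& query_words,
                          const std::vector<std::string>& doc_words,
                          WordMatch word_match) {
    assert(!query_words.empty());  // a query has at least one word
    return std::all_of(
        query_words.begin(), query_words.end(),
        [&](const std::string& q) {
            // The threshold is applied to each query word independently.
            return std::any_of(
                doc_words.begin(), doc_words.end(),
                [&](const std::string& d) { return word_match(q, d); });
        });
}
```

For MT_EXACT_MATCH the predicate would be plain string equality; for the other two types it would compare a Hamming or edit distance against "match_dist".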