Gene Zhang

United States
6K followers 500+ connections

Publications

  • SPAX: Simple Path based XML Data Storage and XPath Evaluation

    EDBT/ICDT 2009 Joint Conference

    An XPath-based XML query is typically applied on multiple XML documents, which are stored in an XML repository either in a native XML database or in an XML-typed column of a table in a relational database. Such XML queries are usually used to access pieces of information that are small and, in most cases, scattered throughout the XML documents. Taking this characteristic into consideration, this paper presents an XML data management system, called SPAX, which adopts a Simple Path clustering storage solution. Accordingly, a novel XPath evaluation approach is introduced. The system can avoid retrieving unneeded data into memory, reduce I/O times, and thereby speed up XPath evaluation. Extensive experimental results reported in this paper demonstrate that the approach is promising and can achieve significant performance improvements.

  • QuickXScan: Efficient Streaming XPath Evaluation

    International Conference on Internet Computing '06

    Many XML applications over the Internet favor high-performance single-pass streaming XPath evaluation. Finite automata-based algorithms suffer from a potentially combinatorial explosion of dynamic states for matching descendant axes. We present QuickXScan for streaming evaluation of XPath queries containing child and descendant axes with complex predicates. Using a tree representation for an XPath query, it employs a matching grid, a compact tree of interrelated stacks as in the holistic twig join algorithms, to represent the matches and their relationships. QuickXScan fully utilizes transitivity for matching, thus reducing the number of active states to the query size in the worst case. It also evaluates expressions incrementally using propagation and iterative rules, and produces result sequences without the need for duplicate removal. QuickXScan is practical and highly efficient.

  • DB2 goes hybrid: Integrating native XML and XQuery with relational data and SQL

    IBM Systems Journal

  • Building a scalable native XML database on infrastructure for relational databases

    International Workshop XIMEP’05 at SIGMOD '05

    We describe the architecture and some aspects of System R/X, a native XML database engine that is built on the same mature infrastructure as a relational database and integrated with the relational engine. We describe which parts of the infrastructure can be reused, which need to be extended, and which are entirely new to the XML database, along with their techniques. Our overall strategy is to base XML storage and search on the scalable relational technology with substantial extensions. Many techniques are novel to our knowledge. We also provide perspectives throughout the discussion and point out some open research issues.

  • An Efficient XML Schema Typing System

    An XML Schema Typing System based on finite state automata is presented. Existing XML Schema validation parsers are so inefficient that wide adoption of XML Schema by transaction processing systems is at risk in practice. The architecture proposed in this paper compiles XML Schemata into annotated finite automata, which can support efficient validation against XML Schemata for XML documents and fragments. It can also support type annotation for XML Schema-validated XML data without going through validation. Furthermore, the automata can provide quick access to type information in XML schemata for static and dynamic type checking during XML query processing. This typing system can be used as part of the runtime environment for XML processing languages that adopt XML Schema as the type system.

    Other authors
    • Ning Wang
    • Peter Housel
    • Michael Franz
  • Query formulation from high-level concepts for relational databases

    UIDIS '99 Proceedings of the 1999 User Interfaces to Data Intensive Systems

    A new query formulation system based on a semantic graph model is presented. The graph provides a semantic model for the data in the database with user-defined relationships. The query formulator allows users to specify their requests and constraints in high-level concepts. The query candidates are formulated based on the user input by a graph search algorithm and ranked according to a probabilistic information measure. English-like query descriptions can also be provided for users to resolve ambiguity when multiple queries are formulated from a user input. For complex queries, we introduce an incremental approach, which assists users to achieve a complex query goal by formulating a series of simple queries. A prototype system with a multimodal interface using the high-level query formulation techniques has been implemented on top of a cooperative database system (CoBase) at UCLA.

  • Interactive Query Formulation Techniques for Databases

    PhD dissertation, UCLA

    Building a query tool for end-users to query a database has been an important aspect of the development of database technologies. To advance the state of the art of this technology, a set of techniques for interactive database query formulation is developed in this dissertation and a prototype system is implemented. The techniques include the following: (1) High-level query formulation for simple queries; (2) Incremental query answering for complex queries; (3) Associative query answering for relevant information; and (4) Semantic query guidance for incremental query sessions. This set of techniques aims to make formulation of ad hoc queries, including both navigational and aggregational queries, easier for end-users.

    Our approach to query formulation is based on a semantic graph model, which is a semantic representation of the data in the database augmented with a probabilistic information measure. This graph model can be semi-automatically generated from the database schema. The high-level query formulator allows users to formulate their queries by specifying simple requests and constraints, without having to use the database query language. The formulator completes queries based on user input, ranks candidate queries according to probability, and uses English-like query descriptions for resolving ambiguity during the formulation process.

    Incremental query answering allows users to obtain the necessary information by constructing a series of simple queries rather than being burdened with query syntax and structures for complex queries. Furthermore, associative query answering provides additional relevant information for queries; query guidance provides suggestions for subsequent queries at each step in an incremental query session. All the techniques provide users with useful assistance in finding their desired information. Case-based and probabilistic reasoning techniques are used for associative query answering and query guidance. ...

  • Associations and roles in object-oriented modeling

    Conceptual Modeling - ER'97

    We present an extended ER model with entity, role, and association as the basic constructs for object-oriented modeling. The purpose of the constructs is to support object evolution and extension for long-lived objects. A class hierarchy consists of a static part and a dynamic part. The static part is a classification of entity classes, while the dynamic part is the role classes played by entities. The interactions among objects are captured with association classes. Based on the observation that entities play roles in association with other entities, we provide a unified view on roles in associations and roles as an extension to objects. The proposed modeling constructs help developers better understand the interrelationship among entities, thus resulting in flexible implementations for dynamic systems.

  • Associative Query Answering via Query Feature Similarity

    IIS '97 Proceedings of the 1997 IASTED International Conference on Intelligent Information Systems (IIS '97)

    Associative query answering provides additional relevant information that is not explicitly asked for in a query but is of interest to the user. For a given query, associative information may be derived from past user query cases based on the user type and the query context. A case-based reasoning approach that matches query features is proposed. A query feature consists of the query topic, the output attribute list, and the selection constraints. The similarity of query features is defined and can be evaluated from the semantic model that is derived from the database schema. A query-feature-based associative attribute search is presented.

  • A performance model of space-division ATM switches with input and output queueing

    HICSS '96

    An analytic model for the performance evaluation of space-division asynchronous transfer mode (ATM) switches is presented. This model assumes that the switch has a fixed capacity of m, where 1 < m <= N (N is the number of trunks). Other important parameters include arrival rate and buffer sizes. Numerical solutions for the maximum throughput, cell delay, and cell loss probability are given with simulation being utilized in order to validate the analytic model. For independent and identical Bernoulli arrivals, the study shows that the contention processes can be modeled as discrete M/D/m (FIFO or Random) queues, while input queues can be modeled by Geom/G/1 queues, and the output queues are G^[X]/D/1 queues. A closed-form approximation for cell delay when m > 2 is given. The result shows that the performance of switches with a small capacity can approach that of output queueing. The model and result can be used for switch design analysis and higher layer performance models.

    Other authors
    • W. Bulgren
    • Victor Wallace

Patents

  • GLOBAL DISTRIBUTED TRANSACTIONS ACROSS MICROSERVICES

    Filed US TBD

    A global transaction system receives a transaction request for a plurality of database services of microservices...

  • Key-Value Replication with Consensus Protocol

    Filed US TBD

    A replicated key-value store is implemented using a "last-write-wins" consensus protocol. To improve throughput and latency in cross-data-center configurations, a system deploys a cross-cluster, learner-only member to a cluster of nodes (e.g., a data center). The cross-cluster, learner-only member submits key-values received at local leader members to remote clusters. Conflicts between the key-values and initial values at the remote clusters are solved using a "last-write-wins" consensus protocol. (note: the hybrid logical clock can be used to resolve conflicts for a "last-write-wins" strategy.)
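
    The conflict rule mentioned in the note can be illustrated with a minimal sketch. It assumes a timestamp of the form (physical time, logical counter, node id) and uses the commonly published hybrid logical clock update rules; the class names `HybridLogicalClock` and `LwwStore` are hypothetical and are not taken from the patent filing.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Hypothetical timestamp layout: (physical_ms, logical_counter, node_id).
# Tuples compare lexicographically, which yields a total order across nodes.
Timestamp = Tuple[int, int, str]


@dataclass
class HybridLogicalClock:
    node_id: str
    physical: int = 0
    logical: int = 0

    def now(self, wall_ms: int) -> Timestamp:
        """Timestamp a local write, given the current wall-clock reading."""
        if wall_ms > self.physical:
            self.physical, self.logical = wall_ms, 0
        else:
            # Wall clock stalled or went backwards: bump the logical part.
            self.logical += 1
        return (self.physical, self.logical, self.node_id)

    def observe(self, remote: Timestamp, wall_ms: int) -> None:
        """Advance the clock after receiving a remote timestamp."""
        r_phys, r_log, _ = remote
        new_phys = max(self.physical, r_phys, wall_ms)
        if new_phys == self.physical == r_phys:
            self.logical = max(self.logical, r_log) + 1
        elif new_phys == self.physical:
            self.logical += 1
        elif new_phys == r_phys:
            self.logical = r_log + 1
        else:
            self.logical = 0
        self.physical = new_phys


class LwwStore:
    """Key-value store that keeps the value with the largest timestamp."""

    def __init__(self) -> None:
        self.data: Dict[str, Tuple[Timestamp, str]] = {}

    def apply(self, key: str, value: str, ts: Timestamp) -> bool:
        """Apply a local or replicated write; return True if it won."""
        current = self.data.get(key)
        if current is None or ts > current[0]:
            self.data[key] = (ts, value)
            return True
        return False
```

    Because the node id is the last tuple component, two writes with identical physical and logical parts are still ordered deterministically, so every replica converges on the same winner regardless of delivery order.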

  • Dynamic selection of optimal grouping sequence at runtime for grouping sets, rollup and cube operations in SQL query processing

    Issued US 9535952

    A method, apparatus, and article of manufacture for optimizing a query in a computer system. Grouping operations are optimized during execution of the query in the computer system by: (1) translating the grouping operations into a plurality of levels, wherein each of the levels is comprised of one or more grouping sets with the same number of grouping expressions; (2) deriving the grouping sets on a level-by-level basis, wherein the grouping sets in a base level are obtained from the database and the grouping sets in a next one of the levels are derived by selecting as an input a smallest one of the grouping sets in a previous one of the levels with which it has a derivation relationship; and (3) combining the derived grouping sets into an output for the query.
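
    The three steps above can be sketched as follows. This is an illustrative toy, not the patented implementation: it assumes a single SUM aggregate (which can be re-aggregated from a finer grouping), in-memory rows as dictionaries, and that "smallest" means the fewest result groups. The function names `scan`, `rollup_from`, and `evaluate_grouping_sets` are hypothetical.

```python
from collections import defaultdict


def scan(table, grouping_set, measure):
    """Aggregate SUM(measure) directly from the base rows."""
    out = defaultdict(int)
    for row in table:
        out[tuple(row[c] for c in grouping_set)] += row[measure]
    return dict(out)


def rollup_from(parent_result, parent_set, child_set):
    """Re-aggregate an already grouped result down to fewer columns."""
    positions = [parent_set.index(c) for c in child_set]
    out = defaultdict(int)
    for key, total in parent_result.items():
        out[tuple(key[i] for i in positions)] += total
    return dict(out)


def evaluate_grouping_sets(table, grouping_sets, measure):
    # Step 1: bucket grouping sets into levels by number of expressions.
    levels = defaultdict(list)
    for gs in grouping_sets:
        levels[len(gs)].append(tuple(gs))
    ordered = sorted(levels, reverse=True)

    results, derived_from = {}, {}
    # Step 2a: the base level is computed from the stored rows.
    for gs in levels[ordered[0]]:
        results[gs] = scan(table, gs, measure)
        derived_from[gs] = None
    # Step 2b: each later level derives from the smallest usable parent
    # in the previous level, chosen at run time from actual result sizes.
    for prev, level in zip(ordered, ordered[1:]):
        for gs in levels[level]:
            parents = [p for p in levels[prev] if set(gs) <= set(p)]
            if parents:
                parent = min(parents, key=lambda p: len(results[p]))
                results[gs] = rollup_from(results[parent], parent, gs)
                derived_from[gs] = parent
            else:
                # Assumption: fall back to a base scan if no parent applies.
                results[gs] = scan(table, gs, measure)
                derived_from[gs] = None
    # Step 3: the combined output is the union of all per-set results.
    return results, derived_from
```

    For `CUBE(a, b)` the grand total `()` would be derived from whichever of `(a)` or `(b)` produced fewer groups, which is only known once those results exist; that is the reason for deferring the choice to run time.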

  • System and method for adaptive vector size selection for vectorized query execution

    Issued US 9436732

    System and method embodiments are provided for adaptive vector size selection for vectorized query execution. The adaptive vector size selection is implemented in two stages. In a query planning stage, a suitable vector size is estimated for a query by a query planner. The planning stage includes analyzing a query plan tree, segmenting the tree into different segments, and assigning to the query execution plan an initial vector size to each segment. In a subsequent query execution stage, an execution engine monitors hardware performance indicators, and adjusts the vector size according to the monitored hardware performance indicators. Adjusting the vector size includes trying different vector sizes and observing related processor counters to increase or decrease the vector size, wherein the vector size is increased to improve hardware performance according to the processor counters, and wherein the vector size is decreased when the processor counters indicate a decrease in hardware performance.
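
    The execution-stage adjustment can be pictured as a simple trial-and-observe loop. The sketch below is an assumption-laden illustration: it covers only the run-time stage (not the planner's per-segment estimate), it stands in for real processor counters with a caller-supplied function `read_counter` that returns a cost where lower is better, and it doubles or halves the size, which the abstract does not specify.

```python
from typing import Callable


def adapt_vector_size(
    initial: int,
    read_counter: Callable[[int], float],
    rounds: int = 8,
    min_size: int = 64,
    max_size: int = 65536,
) -> int:
    """Try larger or smaller vector sizes and keep whichever lowers the cost.

    `read_counter(size)` is a hypothetical probe, e.g. cache misses per row
    observed while running one batch with the given vector size.
    """
    size = initial
    best = read_counter(size)
    grow = True  # start by trying a larger vector
    for _ in range(rounds):
        candidate = size * 2 if grow else size // 2
        candidate = max(min_size, min(max_size, candidate))
        if candidate == size:
            # Hit a bound; probe in the other direction next round.
            grow = not grow
            continue
        cost = read_counter(candidate)
        if cost < best:
            size, best = candidate, cost  # counters improved: keep it
        else:
            grow = not grow  # counters got worse: reverse direction
    return size
```

    With a cost curve that has a single minimum, the loop settles on that size and then oscillates between rejected neighbors, which is why a fixed number of rounds is used here rather than a convergence test.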

  • Efficient methods and systems for consistent read in record-based multi-version concurrency control

    Issued US 9430274

    System and method embodiments are provided for consistent read in a record-based multi-version concurrency control (MVCC) in database (DB) management systems. In an embodiment, a method in a record-based multi-version concurrency control (MVCC) database (DB) management system for a snapshot consistent read includes copying a system commit transaction identifier (TxID) and a current log record sequence number (LSN) from a transaction log at a start of a reader without backfilling of a commit LSN of a transaction to records that are changed and without copying an entire transaction table by the reader; and determining whether a record is visible according to a record TxID, the commit TxID and a current LSN, wherein a transaction table is consulted only when the record TxID is equal to or larger than a commit TxID at a transaction start.
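
    One plausible reading of this visibility rule is sketched below. It rests on assumptions the abstract does not state: that every transaction with an ID below the system commit TxID has already committed, and that the transaction table maps a TxID to its commit LSN (or to nothing if uncommitted). The names `Snapshot` and `is_visible` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Snapshot:
    commit_txid: int  # system commit TxID copied when the reader starts
    lsn: int          # current log LSN copied when the reader starts


def is_visible(
    record_txid: int,
    snap: Snapshot,
    txn_table: Dict[int, Optional[int]],
) -> bool:
    """Decide whether a record version belongs to the reader's snapshot."""
    if record_txid < snap.commit_txid:
        # Fast path (assumption): all older transactions are committed,
        # so the transaction table is never touched.
        return True
    # Slow path: consult the transaction table for this one transaction.
    commit_lsn = txn_table.get(record_txid)  # None means not committed
    return commit_lsn is not None and commit_lsn <= snap.lsn
```

    The point of the fast path is that the reader copies two integers at start instead of the whole transaction table, and most records are resolved by a single comparison.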

  • Efficient method of using XML value indexes without exact path information to filter XML documents for more specific XPath queries

    Issued US 9430582

    A system and method are provided for query processing, comprising: creating an index of a database and ordering a set of index candidates from the index into a list based on a set of heuristic rules. A query defining a query path is then reduced into a list of single path expressions. Each index candidate is matched against the list of single path expressions according to the ordering of the index candidates. The matched candidate nodes are also verified to ensure that they satisfy the query path.

  • FLEXIBLE TASK SCHEDULER FOR MULTIPLE PARALLEL PROCESSING OF DATABASE DATA

    Filed US 20170228422

    A system and method of responding to a database query. A query is received for MPP database data stored on a plurality of processing systems. A total splits number of the database data, each split containing at least a portion of the database, is determined. If the total splits number is greater than a splits threshold number, partial task maps are created and streamed to the processing systems after compiling the query. If the total splits number is less than the splits threshold number, a complete task map for all splits is created and output to the plurality of processing systems.

  • Apparatus and Method for Managing Storage of a Primary Database and a Replica Database

    Filed US 20170097972

    System and method embodiments are provided for using different storage formats for a primary database and its replicas in a database managed replication (DMR) system. As such, the advantages of both formats can be combined with suitable design complexity and implementation. In an embodiment, data is arranged in a sequence of rows and stored in a first storage format at the primary database. The data arranged in the sequence of rows is also stored in a second storage format at the replica database. The sequence of rows is determined according to the first storage format or the second storage format. The first storage format is a row store (RS) and the second storage format is a column store (CS), or vice versa. In an embodiment, the sequence of rows is determined to improve compression efficiency at the CS.

  • System and Method for Database Query

    Filed US 20170091269

    A method includes receiving, by a database system, a query statement and forming a runtime plan tree in accordance with the query statement. The method also includes traversing the runtime plan tree including determining whether a function node of the runtime plan tree is qualified for just-in-time (JIT) compilation. Additionally, the method includes, upon determining that the function node is qualified for JIT compilation, producing a string key in accordance with a function of the function node and determining whether a compiled object corresponding to the string key is stored in a compiled object cache.
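
    The traversal and cache lookup described above can be sketched in a few lines. This is a toy under stated assumptions: Python's built-in `compile()` stands in for a real JIT backend, a node qualifies simply by being a function node with expression text, and the string key is built from that text. `PlanNode`, `CompiledObjectCache`, `qualifies`, and `prepare` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PlanNode:
    kind: str                      # e.g. "scan", "filter", "function"
    expr: str = ""                 # expression text for function nodes
    children: List["PlanNode"] = field(default_factory=list)


class CompiledObjectCache:
    """Maps a string key to a compiled object, compiling on a miss."""

    def __init__(self) -> None:
        self.objects: Dict[str, object] = {}
        self.compilations = 0

    def get_or_compile(self, key: str, expr: str):
        obj = self.objects.get(key)
        if obj is None:
            # Stand-in for JIT compilation of the function.
            obj = compile(expr, "<jit>", "eval")
            self.objects[key] = obj
            self.compilations += 1
        return obj


def qualifies(node: PlanNode) -> bool:
    """Assumed qualification rule: a function node with an expression."""
    return node.kind == "function" and bool(node.expr)


def prepare(node: PlanNode, cache: CompiledObjectCache, out: Dict[str, object]) -> None:
    """Walk the runtime plan tree and attach compiled objects by key."""
    if qualifies(node):
        key = f"fn:{node.expr}"
        out[key] = cache.get_or_compile(key, node.expr)
    for child in node.children:
        prepare(child, cache, out)
```

    Keying the cache on a string derived from the function means two plan nodes, or two separate queries, that use the same function share one compiled object instead of paying the compilation cost twice.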

  • DATA PLACEMENT CONTROL FOR DISTRIBUTED COMPUTING ENVIRONMENT

    Filed US 20170031988

    A method includes dividing a dataset into partitions by hashing a specified key, selecting a set of distributed file system nodes as a primary node group for storage of the partitions, and causing a primary copy of the partitions to be stored on the primary node group by a distributed storage system file server such that the location of each partition is known by hashing of the specified key.
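
    A minimal sketch of the idea, assuming a fixed partition count and a simple modulo assignment of partitions to the nodes of the primary group. The helper names `partition_of` and `node_for` and the node labels in the usage note are hypothetical; SHA-256 is used only because it gives the same result on every machine, unlike Python's salted built-in `hash()` for strings.

```python
import hashlib
from typing import List


def partition_of(key: str, num_partitions: int) -> int:
    """Map a key to a partition with a hash that is stable across processes."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def node_for(key: str, primary_group: List[str], num_partitions: int) -> str:
    """Locate the primary-copy node for a key without any directory lookup."""
    partition = partition_of(key, num_partitions)
    return primary_group[partition % len(primary_group)]
```

    Any client that knows the key, the partition count, and the primary node group (for example `["dn1", "dn2", "dn3"]`) can compute where the primary copy lives, which is what lets co-partitioned datasets be joined locally.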

  • APPARATUS AND METHOD FOR UTILIZING DIFFERENT DATA STORAGE TYPES TO STORE PRIMARY AND REPLICATED DATABASE DIRECTORIES

    Filed US 20170031765

    An apparatus and method are provided for utilizing different data storage types to store primary and replicated database directories. Included is a first data storage of a first data storage type including a direct-access storage type. The first data storage is configured to store a primary database directory. Also included is a second data storage of a second data storage type including a share type. The second data storage is configured to store a replicated database directory that replicates at least a portion of the primary database directory.

  • SYSTEM AND METHOD FOR DATA CACHING IN PROCESSING NODES OF A MASSIVELY PARALLEL PROCESSING (MPP) DATABASE SYSTEM

    Filed US 20170010968

    The present technology relates to managing data caching in processing nodes of a massively parallel processing (MPP) database system. A directory is maintained that includes a list and a storage location of the data pages in the MPP database system. Memory usage is monitored in processing nodes by exchanging memory usage information with each other. Each of the processing nodes manages a list and a corresponding amount of available memory in each of the processing nodes based on the memory usage information. Data pages are read from a memory of the processing nodes in response to receiving a request to fetch the data pages, and a remote memory manager is queried for available memory in each of the processing nodes in response to receiving the request. The data pages are distributed to the memory of the processing nodes having sufficient space available for storage during data processing.

  • Systems and Methods for Parallelizing Hash-based Operators in SMP Databases

    Filed US 20160378824

    A system and method for parallelizing hash-based operators in symmetric multiprocessing (SMP) databases is provided. In an embodiment, a method in a device for performing hash-based database operations includes receiving at the device a database query; creating a plurality of execution workers to process the query; and building by the execution workers a hash table from a database table, the database table comprising one of a plurality of partitions and a plurality of scan units, the hash table shared by the execution workers, each execution worker scanning a corresponding partition and adding entries to the hash table if the database table is partitioned, each execution worker scanning an unprocessed scan unit and adding entries to the hash table according to the scan unit if the database table comprises scan units, and the workers performing the scanning and the adding in a parallel manner.
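
    The scan-unit variant of the shared hash build can be sketched with threads pulling unprocessed units from a queue. This is an illustration only: a single lock guards the shared table for simplicity (a real engine would likely use finer-grained synchronization), rows are plain tuples, and `parallel_hash_build` and `key_fn` are hypothetical names.

```python
import queue
import threading
from typing import Callable, Dict, Hashable, List, Sequence, Tuple

Row = Tuple


def parallel_hash_build(
    scan_units: Sequence[Sequence[Row]],
    key_fn: Callable[[Row], Hashable],
    num_workers: int = 4,
) -> Dict[Hashable, List[Row]]:
    """Build one shared hash table from scan units using several workers."""
    table: Dict[Hashable, List[Row]] = {}
    lock = threading.Lock()
    pending: "queue.Queue[Sequence[Row]]" = queue.Queue()
    for unit in scan_units:
        pending.put(unit)

    def worker() -> None:
        while True:
            try:
                unit = pending.get_nowait()  # claim an unprocessed unit
            except queue.Empty:
                return
            for row in unit:
                with lock:  # coarse lock around the shared table
                    table.setdefault(key_fn(row), []).append(row)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return table
```

    Pulling units from a queue balances load automatically when units differ in size, whereas the partitioned variant in the abstract assigns each worker a fixed partition up front.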

  • Query Plan and Operation-Aware Communication Buffer Management

    Filed US 20160364484

    Data messages having different priorities may be stored in different communication buffers of a network node. The data messages may then be forwarded from the communication buffers to working buffers as space becomes available in the working buffers. After being forwarded to the working buffers, the data messages may be available to be processed by upper-layer operations of the network node. Priorities may be assigned to the data messages based on a priority level of a query associated with the data messages, a priority level of an upper-layer operation assigned to process the data messages, or combinations thereof.

  • Apparatus and Method for Using Parameterized Intermediate Representation for Just-In-Time Compilation in Database Query Execution Engine

    Filed US 20160306847

    Embodiments are provided herein for using parameterized Intermediate Representation (IR) for just-in-time (JIT) compilation in database query execution engines. In an embodiment, a method supporting query JIT compilation and execution in a database management system includes identifying a central processing unit (CPU) intensive function in a query, and identifying, in the CPU intensive function, one or more parameters. The one or more parameters represent variables with values changeable at different query instances. The CPU intensive function is compiled to a parameterized IR including the one or more parameters. The parameterized IR of the CPU intensive function is saved in a catalog of parameterized IRs.

    Other inventors
    See patent
  • BIG DATA STATISTICS AT DATA-BLOCK LEVEL

    Filed US 20160306810

    System and method for storing statistical data of records stored in a distributed file system. In one aspect, a statistical data block is allocated in a memory of a data node for storing statistical data of records stored in a storage disk of the data node. Each data block of the plurality of data blocks in the data node has a respective entry in the statistical data block, which is collocated with data blocks on the data node. Statistical data of records stored in the distributed file system are collected and written to the statistical data block in the memory of the data node.

    Other inventors
    See patent
  • QUERY OPTIMIZATION ADAPTIVE TO SYSTEM MEMORY LOAD FOR PARALLEL DATABASE SYSTEMS

    Issued US 20160246842

    A method for adaptively generating a query execution plan for a parallel database distributed among a cluster of data nodes includes receiving memory usage data from multiple data nodes including network devices, calculating a representative memory load corresponding to the data nodes based on the memory usage data, categorizing a memory mode corresponding to the data nodes based on the calculated representative memory load, calculating an available work memory corresponding to the data nodes based on the memory mode, and generating the query execution plan for the data nodes based on the available work memory, wherein the memory usage data is based on monitored individual memory loads associated with the data nodes and the query execution plan corresponds to the currently available work memory.

    Other inventors
    See patent
  • CONCURRENCY CONTROL IN A SHARED STORAGE ARCHITECTURE SUPPORTING ON-PAGE IMPLICIT LOCKS

    Filed US 20160092488

    Presented systems and methods can facilitate efficient and effective information storage management. A system may include a plurality of nodes, shared storage and a centralized lock manager. A storage management method can include: receiving an access request to information; performing a lock resolution process; and performing an access operation (e.g., read, information update, etc.). The information can be associated with a shared storage component. The lock resolution process can include participating in a lock management process that manages a physical lock (P-lock), wherein the lock management process utilizes transaction information associated with an implicit lock process and proceeds without communication overhead associated with explicit requests for a logical lock.

    Other inventors
    See patent
  • Encoded data processing

    Issued US 8832046

    Techniques are provided for encoded data processing which allows for continuous data processing as encoded data changes. Data is decomposed into one or more blocks with each block containing at least one data record. At least one data record within a given block is encoded with a first encoding process selected from one or more encoding processes. The first encoding process is associated with the given data block. Techniques evaluate whether or not to implement an encoding change for a given block when updating a given data record in the given block. Responsive to the evaluation, the given block is re-encoded with a second encoding process. Responsive to the re-encoding, the association of the given block is updated. A map is formed to convert the given data record encoded with the first encoding process to the second encoding process so as to preserve comparative relationships of the given data record.

    Other inventors
    See patent
  • Archiving data in database management systems

    Issued US 8825604

    According to one embodiment of the present invention, at least a portion of data from a first processing system is archived onto a second processing system based on partitions of the data. A query received at the first processing system is processed at the second processing system to retrieve archived data satisfying the received query in response to determining at the first processing system that the received query encompasses archived data. Embodiments of the present invention further include methods, systems, and computer program products for archiving and accessing data in substantially the same manner described above.

    Other inventors
    See patent
  • System and Method for Massively Parallel Processing Database

    Filed US 20150293966

    In one embodiment, a method of performing point-in-time recovery (PITR) in a massively parallel processing (MPP) database includes receiving, by a data node from a coordinator, a PITR recovery request and reading a log record of the MPP database. The method also includes determining a type of the log record and updating a transaction table when the type of the log record is an abort transaction or a commit transaction.

    Other inventors
    See patent
  • Systems and Methods to Optimize Multi-version Support in Indexes

    Filed US 20150278270

    System and method embodiments are provided for multi-version support in indexes in a database. The embodiments enable substantially optimized multi-version support in index and avoid backfill of commit log sequence number (LSN) for a transaction identifier (TxID). In an embodiment, a method in a data processing system for managing a database includes determining with the data processing system whether a record is deleted according to a delete indicator in an index leaf page record corresponding to the record; and determining with the data processing system, when the record is not deleted, whether the record is visible according to a new record indicator in the index leaf page record and according to a comparison of a system commit TxID at the transaction start with a record commit TxID obtained from the index leaf page record.

    See patent
  • Scalable storage schemes for native XML column data of relational tables

    Issued US 8572125

    A method and system for providing a scalable storage scheme for native hierarchically structured data of relational tables, includes a base table with indicator columns with information pertaining to hierarchically structured data of a document, data tables for storing the hierarchically structured data corresponding to the indicator columns, and node identifier indexes corresponding to the data tables for mapping between the indicator columns and the hierarchically structured data in the data tables. In an embodiment, actual data for each hierarchically structured data (such as XML) column is stored in a separate data table, and each data table has a separate node identifier index. The node identifier index is searched with a key containing the document identifier and a logical node identifier is used, and a record identifier of a record in the data table containing the node assigned the logical node identifier is retrieved.

    Other inventors
    See patent
  • Packing nodes into records to store XML XQuery data model and other hierarchically structured data

    Issued US 8543614

    A storage of nodes of hierarchically structured data uses logical node identifiers to reference the nodes stored within and across record data structures. A node identifier index is used to map each logical node identifier to a record identifier for the record that contains the node. When a sub-tree is stored in a separate record, a proxy node is used to represent the sub-tree in the parent record. The mapping in the node identifier index reflects the storage of the sub-tree nodes in the separate record. Since the references between the records are through logical node identifiers, there is no limitation to the moving of records across pages, as long as the indices are updated or rebuilt to maintain synchronization with the resulting data pages. This approach is highly scalable and has a much smaller storage consumption than approaches that use explicit references between nodes.

    Other inventors
    See patent
  • Method and system for XPath execution in XML (extensible markup language) data storage bank

    Issued US

    The invention discloses a method and system for XPath execution in an XML (extensible markup language) data storage bank. The method for XPath execution in the XML data storage bank comprises the following steps: analyzing step: analyzing the input XPath query by utilizing a simple path file to generate an execution tree related to the XPath query, wherein the simple path file is an XML file generated on the basis of the hierarchical structure of a plurality of XML files in the data storage bank, wherein the node name of each node in one generated XML file is generated by recording the label information of each node in the plurality of XML files in the data storage bank; and executing step: executing the execution tree on the data storage bank to generate a final execution result.

    Other inventors
    See patent
  • XML sub-document versioning method in XML databases using record storages

    Issued US 8161004

    A new sub-document versioning method for record storages of XML documents which uses virtual cutting points to ensure that a search tree is able to support multiple versions of sub-documents and provide efficient mechanisms for XML updating. Record boundaries and virtual cut points divide the two-dimensional space, the horizontal axis representing node identifiers in document order and vertical axis representing version numbers, into rectangles. The bottom corner of the rectangle is used to represent the rectangles and the corresponding information of the corner is added to the search tree index.

    Other inventors
    See patent
  • Multi-Versioning Mechanism for Update of Hierarchically Structured Documents Based on Record Storage

    Filed US 20110302195

    A method for multi-versioning data of a hierarchically structured document stored in data records includes: changing document data in one or more data records, each data record assigned a record identifier, the data record including a plurality of nodes assigned a node identifier, and the document assigned a document identifier; storing an update timestamp in a base table row referencing the document identifier; storing in each changed data record a start timestamp for a start of a validity period for the changed data record and an end timestamp for an end of the validity period; and storing the start timestamp and the end timestamp in one or more node identifier index entries referencing the document identifier, the record identifier, and the node identifier. A version of the document may be obtained using node identifier index entries satisfying a version timestamp.

    Other inventors
    See patent
  • Efficient locking protocol for sub-document concurrency control using prefix encoded node identifiers in XML databases

    Issued US 8019779

    A system and method for concurrency control of hierarchically structured data is provided. Lock requests on a target node are processed by exploiting ancestor-descendant information encoded into prefix encoded node identifiers (IDs). A set of implicit locks on ancestor nodes along a path from an immediate parent of a target node to a root node is derived from an explicit lock request on a target node. A logical lock tree describing existing lock modes for ancestor nodes is consulted to determine compatibility with the derived set of implicit locks. If existing lock modes for ancestor nodes are compatible with the derived set of implicit locks, a lock request on a target node is granted. Otherwise, the lock request is denied. A lock release request follows the reverse process; a target node in a particular transaction is released, as are subsequent locks on its ancestors made by the same transaction.

    Other inventors
    • Jim Teng
    • Brian Vickery
    See patent
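
    The lock-request flow in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the patented protocol: node IDs use one byte per tree level (so every ancestor ID is a proper prefix), the lock modes and compatibility pairs are the textbook intention-lock set (IS, IX, S, X), and the names `ancestor_ids` and `request_lock` are hypothetical.

```python
from typing import Dict, List, Tuple

# Assumption: standard multi-granularity compatibility, as (held, requested).
COMPATIBLE = {
    ("IS", "IS"), ("IS", "IX"), ("IS", "S"),
    ("IX", "IS"), ("IX", "IX"),
    ("S", "IS"), ("S", "S"),
}

# Logical lock tree flattened into a map: node ID -> [(transaction, mode)].
LockTable = Dict[bytes, List[Tuple[str, str]]]


def ancestor_ids(node_id: bytes) -> List[bytes]:
    """Ancestors from the immediate parent up to the root (one byte per level)."""
    return [node_id[:n] for n in range(len(node_id) - 1, 0, -1)]


def request_lock(table: LockTable, txn: str, node_id: bytes, mode: str) -> bool:
    """Grant an explicit S or X lock plus derived implicit ancestor locks."""
    intent = "IX" if mode == "X" else "IS"
    wanted = [(node_id, mode)] + [(a, intent) for a in ancestor_ids(node_id)]
    # Deny if any lock held by another transaction conflicts.
    for nid, requested in wanted:
        for holder, held in table.get(nid, []):
            if holder != txn and (held, requested) not in COMPATIBLE:
                return False
    # All compatible: record the explicit lock and the implicit ancestor locks.
    for nid, requested in wanted:
        table.setdefault(nid, []).append((txn, requested))
    return True
```

    Because ancestors are derived from the ID itself, no tree traversal is needed to find the nodes that must carry implicit locks.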
  • Self-adaptive prefix encoding for stable node identifiers

    Issued US 7937413

    A variable-length binary string is utilized to encode node identifiers in a tree for an XML document object model. A general prefix encoding scheme is followed; a node identifier is generated by the concatenation of encodings at each level of a tree along a path from a root node to another particular node. Arbitrary insertions are supported without change to existing node identifier encodings. In addition, the method provides for document order when unsigned binary string comparison is used to compare encoded node identifiers. In support of sub-document concurrency control, prefix encoding provides a way to derive ancestor-descendant relationships among nodes in a tree. Lastly, the encoding method provides a natural pre-order clustering sequence, also known as depth-first clustering. If a prefix is applied to an encoding with a level number, starting with zero at the root, width-first clustering will result. A mixed clustering can also be supported.

    Other inventors
    • Brian Tran
    See patent
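
    Two properties named in the abstract above, document order under unsigned byte comparison and ancestor detection by prefix, can be illustrated with a simplified sketch. It assumes a fixed one-byte encoding per level rather than the variable-length self-adaptive encoding the patent describes, and it does not reproduce insertion between existing siblings; `make_node_id` and `is_ancestor` are hypothetical names.

```python
from typing import Sequence


def make_node_id(path: Sequence[int]) -> bytes:
    """Concatenate one byte per level along the root-to-node path."""
    if any(not 1 <= step <= 255 for step in path):
        raise ValueError("each level value must fit in one non-zero byte")
    return bytes(path)


def is_ancestor(candidate: bytes, node: bytes) -> bool:
    """With prefix encoding, an ancestor's ID is a proper prefix of the node's ID."""
    return len(candidate) < len(node) and node.startswith(candidate)


# Python compares bytes objects lexicographically as unsigned values, so
# sorting the IDs yields pre-order (depth-first) document order.
a = make_node_id([1, 2])
a_child = make_node_id([1, 2, 1])
b = make_node_id([1, 3])
document_order = sorted([b, a_child, a])
```

    Here `document_order` places `a` before `a_child` and `a_child` before `b`, since a prefix sorts ahead of any of its extensions.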
  • Efficient XML schema validation of XML fragments using annotated automaton encoding

    Issued US 7890479

    An XML schema is compiled into an annotated automaton encoding, which includes a parsing table for structural information and annotation for type information. The representation is extended to include a mapping from schema types to states in a parsing table. To validate a fragment against a schema type, it is necessary simply to determine the state corresponding to the schema type, and start the validation process from that state. When the process returns to the state, fragment validation has reached successful completion. This approach is more efficient than a general tree representation. Only the data representation of the schema information is handled, making it much easier than manipulating validation parser code generated by a parser generator. In addition, only one representation is needed for schema information for both document and fragment validation. This approach also provides a basis for incremental validation after update.

    Other inventors
    See patent
  • Order-preserving encoding formats of floating-point decimal numbers for efficient value comparison

    Issued US 7685214

    A method for conversion between a decimal floating-point number and an order-preserving format has been disclosed. The method encodes numbers in the decimal floating-point format into a format which preserves value ordering. This encoding allows for fast and direct string comparison of two values. Such an encoding provides normalized representations for decimal floating-point numbers and supports type-insensitive comparisons. Type-insensitive comparisons are often used in database management systems, where the data type is not specified for values to compare. In addition, the original decimal floating-point format can be recovered from the order-preserving format.

    Other inventors
    See patent
  • Annotated automaton encoding of XML schema for high performance schema validation

    Issued US 7493603

    A method and system for Extensible Markup Language (XML) schema validation, includes: loading an XML document into a runtime validation engine, where the runtime validation engine includes an XML schema validation parser; loading an annotated automaton encoding (AAE) for an XML schema definition into the XML schema validation parser; and validating the XML document against the XML schema definition by the XML schema validation parser utilizing the annotated automaton encoding. Each XML schema definition is compiled once into the AAE format, rather than being compiled each time an XML document is validated, and thus significant time is saved. The code for the runtime validation engine is fixed and does not vary depending on the XML schema definition, rather than varying for each XML schema definition, and thus space overhead is minimized. Flexibility in the validation process is provided without compromising performance.

    Other inventors
    • Gene Fuh
    • Yun Wang
    See patent
  • STREAMING XPATH ALGORITHM FOR XPATH EXPRESSIONS WITH PREDICATES

    Filed US 20080222176

    A method and system for evaluating a path query are disclosed. The path query corresponds to a query tree including a plurality of query nodes. At least one query node corresponds to at least one predicate and is at a level. The predicate(s) are evaluated for previous query node(s). The method and system include scanning data nodes of a document and determining if the data nodes match the query nodes. The method and system also include placing data related to the data node in match stacks corresponding to matched query nodes. The data for the query node(s) include attribute(s) corresponding to the predicate(s). The method and system further include propagating a matching of the at least one query node backward to a matching of the at least one previous query node.

    Other inventors
    See patent
  • Streaming XPath algorithm for XPath value index key generation

    Issued US 7346609

    A method generates hierarchical path index keys for single and multiple indexes with one scan of a document. Each data node of the document is scanned and matches to query nodes are identified. A data node matches a query node if three conditions hold: if it is not the root step, there is a match for the query node in the previous step of the query; the data node matches the query node of the current step; and the edges of the data and query nodes match. A sub-tree of a data node can be skipped if the data node is not matched and its level is less than the fixed levels of the query. The matched data node is then placed in the match stacks corresponding to the matched query nodes. The method uses transitivity properties among matching units to reduce the number of states that need to be tracked and to improve the evaluation of path expressions significantly.

    Other inventors
    See patent
  • Materialized view signature and efficient identification of materialized view candidates for queries

    Issued US 7246115

    A method and system for efficiently identifying materialized view candidates for queries filters materialized views using certain criteria, using the materialized view signatures. This filtering rejects some of the unqualified materialized views prior to the performance of the query rewrite matching algorithm, resulting in a group of materialized view candidates. The query rewrite matching algorithm is then performed on the materialized view candidates. By first filtering the materialized views based on their signatures, the number of materialized views on which the query rewrite matching algorithm is performed is significantly reduced, improving performance.

    Other inventors
    See patent
  • Method, system, and program for optimizing aggregate processing

    Issued US 7243098

    Disclosed is a method, system, and program for processing an aggregate function. Rows that contain a reference to intermediate result structures are grouped to form groups. For each group, aggregate element structures are formed from the intermediate result structures and, if the aggregate function specifies ordering, the aggregate element structures are sorted based on a sort key.

    Other inventors
    See patent
  • Query transformation for union all view join queries using join predicates for pruning and distribution

    Issued US 7188098

    A method, apparatus, and article of manufacture for optimizing a query in a computer system, wherein the query is performed by the computer system to retrieve data from a database stored on the computer system. The optimization includes: (a) combining join predicates from a query with local predicates from each branch of one or more UNION ALL views referenced by the query; (b) analyzing the combined predicates; and (c) not generating the join when the analysis step indicates that the combined predicates lead to an empty result.

    Other inventors
    • Steve Chen
    • Ding-wei Chieh
    • Huong Tran
    • Yumi Tsuji
    See patent
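
    The pruning idea in the abstract above, combining a join predicate with each UNION ALL branch's local predicate and skipping the join when the combination is empty, can be sketched as follows. The sketch assumes every predicate reduces to an inclusive integer range on one column; `Branch`, `intersect`, and `branches_to_join` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Range = Tuple[int, int]  # inclusive (low, high) bounds on the join column


def intersect(a: Range, b: Range) -> Optional[Range]:
    """Combine two range predicates; None means the result is provably empty."""
    low, high = max(a[0], b[0]), min(a[1], b[1])
    return (low, high) if low <= high else None


@dataclass
class Branch:
    name: str           # table behind one UNION ALL branch
    local_range: Range  # the branch's local predicate


def branches_to_join(branches: List[Branch], join_range: Range) -> List[str]:
    """Keep only branches whose combined predicates can return rows."""
    return [b.name for b in branches
            if intersect(b.local_range, join_range) is not None]


view = [
    Branch("sales_2001", (20010101, 20011231)),
    Branch("sales_2002", (20020101, 20021231)),
    Branch("sales_2003", (20030101, 20031231)),
]
kept = branches_to_join(view, join_range=(20020301, 20030115))
```

    In this example the join is generated only for `sales_2002` and `sales_2003`; the `sales_2001` branch is pruned because its combined predicates are empty.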
  • Method, system, and program for optimizing processing of nested functions

    Issued US 7124137

    Disclosed is a method, system, and program for processing a function. A set of nested functions are received. A composite function is generated for the set of nested functions. A tagging template is generated for the set of nested functions that corresponds to the composite function. A result is produced by evaluating the composite function using the tagging template.

    Other inventors
    See patent
  • Eliminating superfluous namespace declarations and undeclaring default namespaces in XML serialization processing

    Issued US 7120864

    In one embodiment, at least a portion of an object model having at least one namespace is serialized. An ancestor namespace is searched for based on a current namespace declaration. The ancestor namespace is associated with an ancestor prefix and an ancestor uniform resource identifier (URI). The current namespace is associated with a current prefix and a current URI. The search is performed to find an ancestor prefix that matches the current prefix. When the current namespace is an implicit no default namespace and the ancestor namespace is an explicit default namespace based on, at least in part, the ancestor prefix, a serialized namespace declaration is generated for the current namespace.

  • Efficient heuristic approach in selection of materialized views when there are multiple matchings to an SQL query

    Issued US 7089225

    A heuristic approach is used to order materialized view (MV) candidates in a list based on descending order of their reduction power. A query (e.g., SQL query) is then matched with the MVs in the list order, wherein searching is stopped when a match has been found. The query is matched with materialized views in the ordered list by identifying a materialized view candidate as follows: identifying an MV that is not locked by a REFRESH process; identifying a matching MV that does not require a regroup; identifying a matching MV that does not require a rejoin; identifying a matching MV that does not require a residual join; or identifying an MV with the largest reduction power from the list of candidates.

  • Method, system, and program for optimizing the processing of queries involving set operators

    Issued US 6792420

    Provided is a method, system, and program for processing a query including a query operation on a table derived from a set operation on two result tables. The query operation is performed on each result table separately to produce two intermediate result tables. The set operator is then applied to the two intermediate result tables to produce a final result table that is the same result table that would have been produced by performing the query operation on the table derived from the set operation performed on the two result tables.

    Other inventors
    • Steve Chen
    • Yumi Tsuji
    • Yun Wang
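
    The rewrite in the abstract above can be illustrated with a small sketch. It assumes the simplest case only: the query operation is a selection (a row filter) and the set operator is UNION with set semantics. The function names and sample rows are hypothetical, and other query operations would need their own correctness argument.

```python
from typing import Callable, FrozenSet, Tuple

Row = Tuple[str, int]
Table = FrozenSet[Row]


def select(rows: Table, pred: Callable[[Row], bool]) -> Table:
    """Apply a selection predicate to a table modeled as a set of rows."""
    return frozenset(r for r in rows if pred(r))


def filter_after_union(a: Table, b: Table, pred: Callable[[Row], bool]) -> Table:
    """Baseline plan: materialize the set operation first, then filter."""
    return select(a | b, pred)


def filter_before_union(a: Table, b: Table, pred: Callable[[Row], bool]) -> Table:
    """Rewritten plan: filter each input separately, then apply the set operator."""
    return select(a, pred) | select(b, pred)


# Hypothetical result tables of (name, amount) rows.
t1: Table = frozenset({("ann", 10), ("bob", 50), ("cat", 70)})
t2: Table = frozenset({("bob", 50), ("dan", 90), ("eve", 5)})


def is_large(row: Row) -> bool:
    return row[1] >= 50


baseline = filter_after_union(t1, t2, is_large)
rewritten = filter_before_union(t1, t2, is_large)
print(sorted(rewritten))
```

    Both plans return the same rows here, but the rewritten plan never builds the full union, which is where the potential savings come from.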
  • Efficient type annotation of XML schema-validated XML documents without schema validation

    Filed US 20050177578

    Storage of type annotation record information, used in annotated automaton encoding for high-performance XML schema validation, is optimized for space efficiency. Once the type annotation record information has been organized, type annotation records are used for type annotation of validated XML documents, either by implementing only the annotation records and the type annotation part of an algorithm, or by skipping one or more validation steps in a full validation implementation. Given a schema context, a type annotation may be performed for a validated XML fragment as opposed to an entire document. In addition, default features such as attribute and type are supported.

    Other inventors
    • Steve Chen
    • Ning Wang
  • Query transformation for union all view join queries using join predicates for pruning and distribution

    Filed US 20050065926

    A method, apparatus, and article of manufacture for optimizing a query in a computer system, wherein the query is performed by the computer system to retrieve data from a database stored on the computer system. The optimization includes: (a) combining join predicates from a query with local predicates from each branch of one or more UNION ALL views referenced by the query; (b) analyzing the combined predicates; and (c) not generating the join when the analysis step indicates that the combined predicates lead to an empty result.

    Other inventors
    • Steve Chen
    • Ding-wei Chieh
    • Huong Tran
    • Yumi Tsuji
  • Matching groupings, re-aggregation avoidance and comprehensive aggregate function derivation rules in query rewrites using materialized views

    Filed US 20040122814

    A method, apparatus, and article of manufacture for optimizing a query in a computer system, wherein the query is performed by the computer system to retrieve data from a database stored on the computer system. The optimization includes: identifying a materialized view candidate in the computer system, matching a grouping of the materialized view with a grouping of the query using column equivalence and functional dependency, in order to determine whether re-aggregation is necessary, deriving one or more aggregate functions requested by the query from the materialized view and any remaining tables in the query based on the matched groupings, and rewriting the query based on the matched groupings.
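
    To make the aggregate derivation concrete, here is a minimal sketch under strong simplifying assumptions: the materialized view stores SUM and COUNT per (region, city), and the query groups by a subset of those columns. It does not model column equivalence or functional dependency, and the names `MV_ROWS` and `answer_from_mv` and the sample data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Sequence, Tuple

# Hypothetical materialized view: GROUP BY (region, city) with SUM(sales) and COUNT(*).
MV_GROUP_COLS = ("region", "city")
MV_ROWS = [
    {"region": "west", "city": "sf", "sum_sales": 300.0, "cnt": 3},
    {"region": "west", "city": "la", "sum_sales": 100.0, "cnt": 1},
    {"region": "east", "city": "nyc", "sum_sales": 250.0, "cnt": 5},
]


def answer_from_mv(query_group_cols: Sequence[str]) -> Dict[Tuple[str, ...], Dict[str, float]]:
    """Answer a grouped SUM/COUNT/AVG query from the view instead of the base table."""
    if not set(query_group_cols) <= set(MV_GROUP_COLS):
        raise ValueError("query grouping is not derivable from this view")
    # Re-aggregate: SUM is a sum of partial sums, COUNT is a sum of partial counts.
    # If the groupings match exactly, each key sees one view row and this is a pass-through.
    totals = defaultdict(lambda: [0.0, 0])
    for row in MV_ROWS:
        key = tuple(row[c] for c in query_group_cols)
        totals[key][0] += row["sum_sales"]
        totals[key][1] += row["cnt"]
    # AVG is derived from SUM and COUNT; averaging per-city averages would be wrong.
    return {k: {"sum": s, "count": n, "avg": s / n} for k, (s, n) in totals.items()}


by_region = answer_from_mv(["region"])
print(by_region)
```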

  • Method, system, and program for a join operation on a multi-column table and satellite tables including duplicate values

    Issued US 6374235

    Disclosed is a method, system, and program for performing a join operation on a multi-column table and at least two satellite tables having a join condition. Each satellite table is comprised of multiple rows and at least one join column. The multi-column table is comprised of multiple rows and at least one column corresponding to the join column in each satellite table. A join operation is performed on the rows of the satellite tables to generate concatenated rows of the satellite tables. One of the concatenated rows is joined to the multi-column table and a returned entry from the multi-column table is received. A determination is then made as to whether the returned entry matches the search criteria. If so, a determination is made as to whether one of the satellite tables has duplicates of values in the join column of the returned matching entry or the multi-column table has duplicate entries in the join columns.


Languages

  • English

    Full professional proficiency

  • Chinese

    Native or bilingual proficiency

Organizations

  • ACM

  • IEEE Computer Society
