
Announcement

Flint Studios Ltd have acquired the business of Finisco Ltd

We deliver commercially successful projects and exceptional user experiences utilising innovation and cutting edge technology. Speak to us about software and web development, systems design and integration, SEO, branding & print.

Flint Studios Ltd, Scottish Provident Building, 7 Donegall Square West, Belfast. BT1 6JH
t: 028 9091 8435    e: info@flintstudios.co.uk    w: www.flintstudios.co.uk

Genna Summary


GENNA is a hybrid data mining algorithm that couples the strengths of the Nearest Neighbour and Genetic data mining algorithms to extract accurate models implicit in the data set provided to it for learning. Typically, the Nearest Neighbour algorithm depends on a modelling expert to optimise various parameters that affect the performance of the model. GENNA uniquely uses the Genetic Algorithm to optimise these parameters automatically, making the algorithm easy to use. Additionally, GENNA uses innovative indexing mechanisms to speed up the prediction process, which has traditionally been a bottleneck with Nearest Neighbour algorithms.

GENNA can be used for predictive tasks, both classification and regression. The perspicuity of the model and its cognitive basis make it particularly suited to applications where justification of individual predictions is key. Examples of such domains are government and medicine.

Typical example applications to which GENNA has been applied include:
• Churn Analysis
• House Price Prediction for Mass Appraisal
• Prognosis of Colorectal Patients



Genna Uniqueness



Ability to use Censored Observations

Most Data Mining techniques tend to ignore the concept of censored observations, assuming that the observed time is the time of occurrence of the event. While this approach may be convenient, it leads to strong biases within the model, as the true distribution of the predicted field could be very different from the observed distribution. GENNA uniquely provides distance metrics and prediction mechanisms that explicitly handle censored observations by combining elements of evidence theory with well-established statistical techniques such as the Kaplan-Meier estimator and Wilcoxon’s test.
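To illustrate why censoring matters, here is a minimal sketch of the Kaplan-Meier estimator mentioned above, written in plain Python. It is not GENNA's implementation; the function name and the churn-style data in the usage note are assumptions made for illustration only.

```python
from collections import Counter

def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: time until the event, or until observation stopped (censoring)
    observed:  True if the event occurred, False if the record was censored
    """
    events = Counter(t for t, e in zip(durations, observed) if e)
    exits = Counter(durations)      # events and censorings both leave the risk set
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:                       # the curve only steps down at event times
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]         # censored records drop out without an event
    return curve
```

For example, with customer tenures in months `[2, 3, 3, 5, 8]` and flags `[True, True, False, True, False]` (False meaning the customer had not yet churned when the data was extracted), the censored customers still count as "at risk" up to the point they were last seen, rather than being treated as churned or discarded.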

Ability to use Categorical and Numeric Attributes through the use of innovative distance metrics
Generally, nearest neighbour algorithms use similarity metrics that are better suited either to categorical attributes or to numeric attributes. Using both types of attributes together introduces biases. GENNA uses innovative similarity metrics that are suitable for numeric as well as categorical attributes.

Ability to (semi-) automatically optimise the similarity metric used for comparable retrieval
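GENNA's own metrics are not described here, but the mixed-attribute problem can be illustrated with a Gower-style dissimilarity: numeric differences are scaled by the attribute's range, and categorical attributes contribute 0 on a match and 1 otherwise. The function and attribute names below are hypothetical.

```python
def mixed_distance(a, b, numeric_ranges, weights=None):
    """Gower-style distance between two records with mixed attribute types.

    a, b:            dicts mapping attribute name -> value
    numeric_ranges:  dict mapping each numeric attribute -> (min, max) in the data
    weights:         optional dict of attribute weights (default 1.0 each)
    """
    total = 0.0
    weight_sum = 0.0
    for name in a:
        w = 1.0 if weights is None else weights.get(name, 1.0)
        if name in numeric_ranges:
            lo, hi = numeric_ranges[name]
            span = hi - lo
            # Scale to [0, 1] so large-valued attributes do not dominate.
            d = abs(a[name] - b[name]) / span if span else 0.0
        else:
            # Categorical attribute: simple match / mismatch.
            d = 0.0 if a[name] == b[name] else 1.0
        total += w * d
        weight_sum += w
    return total / weight_sum if weight_sum else 0.0
```

Because every attribute contributes a value between 0 and 1, a floor area measured in hundreds of square metres and a property type such as "terrace" end up on a comparable footing before any attribute weights are applied.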

Automatic Indexing of data for Scalability and Speed
One of the shortfalls of the nearest neighbour family of algorithms is that they do not build “compact” models from data for use in predictions, so the speed of the prediction process can suffer as the data volume increases. To alleviate this problem, GENNA automatically indexes the data using clustering techniques to speed up the prediction process.
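The general idea of a cluster-based index can be sketched as follows. This is an assumption about the approach, not GENNA's actual indexing scheme: the data is grouped with a few rounds of k-means, and a query is compared only with the members of its closest cluster. All names are hypothetical.

```python
import math

def _assign(points, centroids):
    """Put each point in the bucket of its nearest centroid."""
    buckets = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
        buckets[nearest].append(p)
    return buckets

def build_index(points, n_clusters, iterations=10):
    """Cluster the points with a few rounds of k-means (Lloyd's algorithm)."""
    centroids = [list(p) for p in points[:n_clusters]]  # naive deterministic seeding
    for _ in range(iterations):
        buckets = _assign(points, centroids)
        centroids = [
            [sum(col) / len(b) for col in zip(*b)] if b else c  # keep empty clusters in place
            for b, c in zip(buckets, centroids)
        ]
    return centroids, _assign(points, centroids)

def nearest_neighbours(query, index, k):
    """Search only the cluster whose centroid is closest to the query."""
    centroids, buckets = index
    best = min(range(len(centroids)), key=lambda i: math.dist(query, centroids[i]))
    return sorted(buckets[best], key=lambda p: math.dist(query, p))[:k]
```

The trade-off is that the search becomes approximate: a true nearest neighbour lying just across a cluster boundary is missed unless neighbouring clusters are searched as well.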

Incremental Learning and Introspection
Once a model is built using data mining, an important part of the deployment is monitoring the accuracy of the predictions made by the model. Over a period of time, the context in which the model is applied changes, a concept referred to as Concept Drift in Machine Learning literature. With this shift in context the model becomes less accurate in its application. Most data mining algorithms would need to be reapplied to new data, resulting in a new model being built and applied within the new context.

GENNA approaches this problem differently: as new data is collected, whether it represents new observations or feedback from the application of the model, it is incorporated into the current model. If the data consists of new observations, this continuous learning is referred to as Incremental Learning. The incorporation of data on the accuracy of the model’s application, on the other hand, is referred to as Introspection.
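Nearest neighbour models lend themselves to this because "training" largely amounts to storing cases. The sketch below shows one possible reading of the two ideas; the class, its method names and the way feedback is handled are assumptions, not GENNA's API.

```python
import math

class CaseBase:
    """A nearest-neighbour model that grows as new data arrives."""

    def __init__(self, k=3):
        self.k = k
        self.cases = []        # list of (features, target) pairs
        self.abs_errors = []   # absolute errors reported through feedback

    def learn(self, features, target):
        """Incremental learning: store a new observation; nothing is rebuilt."""
        self.cases.append((tuple(features), target))

    def predict(self, features):
        if not self.cases:
            raise ValueError("no cases stored yet")
        nearest = sorted(self.cases, key=lambda c: math.dist(features, c[0]))[: self.k]
        return sum(t for _, t in nearest) / len(nearest)

    def feedback(self, features, actual):
        """Introspection (as sketched here): log the error, then keep the corrected case."""
        self.abs_errors.append(abs(self.predict(features) - actual))
        self.learn(features, actual)
```

After a call to `feedback`, later predictions near the same point reflect the corrected outcome, and the logged errors give a running view of how accuracy changes as the context drifts.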


Genna Key Features


Attribute Weights: User or (Semi) Automatic Generation
GENNA provides the user with three options for optimising the similarity metric used in comparable retrieval. First, the user can provide a weighting for each of the attributes in the input data. Second, the user can provide a ranking of the attributes based on domain knowledge; this ranking is taken into account by the optimisation carried out by the Genetic Algorithm within GENNA. Third, the user can let the Genetic Algorithm generate the weights autonomously.

After the weights have been generated, the user can tune them and generate new models to gain insight into the sensitivity of the model to changes in the individual attribute weights.
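The fully automatic option can be illustrated with a small genetic algorithm that searches for attribute weights minimising the leave-one-out error of a nearest neighbour predictor. This is a generic sketch under stated assumptions (1-nearest-neighbour, truncation selection, uniform crossover, one mutated gene per child); GENNA's operators and fitness measure are not documented here, and user-supplied rankings are not modelled.

```python
import random

def loo_error(weights, rows, targets):
    """Leave-one-out mean absolute error of 1-nearest-neighbour with weighted distance."""
    total = 0.0
    for i, row in enumerate(rows):
        def dist(j):
            return sum(w * (a - b) ** 2 for w, a, b in zip(weights, row, rows[j]))
        nearest = min((j for j in range(len(rows)) if j != i), key=dist)
        total += abs(targets[nearest] - targets[i])
    return total / len(rows)

def evolve_weights(rows, targets, generations=30, pop_size=20, seed=0):
    """Evolve one non-negative weight per attribute."""
    rng = random.Random(seed)
    n = len(rows[0])
    # Start from uniform weights plus random candidates.
    population = [[1.0] * n] + [[rng.random() for _ in range(n)] for _ in range(pop_size - 1)]

    def fitness(w):
        return loo_error(w, rows, targets)

    for _ in range(generations):
        population.sort(key=fitness)
        parents = population[: pop_size // 2]   # the best half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            mum, dad = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(mum, dad)]  # uniform crossover
            i = rng.randrange(n)
            child[i] = max(0.0, child[i] + rng.gauss(0, 0.2))     # mutate one gene
            children.append(child)
        population = parents + children
    return min(population, key=fitness)
```

Because the best candidates are carried over unchanged each generation and uniform weights are in the starting population, the returned weights never score worse than uniform weights on this error measure. Re-running `loo_error` with hand-adjusted weights is a simple way to probe sensitivity to individual attributes.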

Flexibility
GENNA gives the user flexibility over the type of model that is generated. Parameters of the algorithm control the nature of the distance metric, the prediction method and the number of comparables used within the prediction phase of the model. The user can also influence the type of error distribution produced by the model through a parameter that sets the trade-off between accuracy and variability. The algorithm uses well-established statistical metrics to generate a measure of the estimated accuracy and variability of the model.
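As a rough illustration of two of these parameters, the sketch below exposes the number of comparables (`k`) and the prediction method as arguments. The parameter names and the two methods shown are assumptions for illustration, not GENNA's actual settings.

```python
import math

def knn_predict(query, cases, k=3, method="mean"):
    """Predict a numeric target from the k most similar stored cases.

    cases:  list of (features, target) pairs
    method: "mean" for a plain average, "weighted" for inverse-distance weighting
    """
    ranked = sorted(cases, key=lambda c: math.dist(query, c[0]))[:k]
    if method == "mean":
        return sum(t for _, t in ranked) / len(ranked)
    if method == "weighted":
        # The small constant avoids division by zero for an exact match.
        weights = [1.0 / (math.dist(query, f) + 1e-9) for f, _ in ranked]
        return sum(w * t for w, (_, t) in zip(weights, ranked)) / sum(weights)
    raise ValueError(f"unknown method: {method}")
```

A small `k` with distance weighting follows the closest comparables tightly, while a larger `k` with a plain mean smooths the prediction across more cases.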


Genna Userbase


Genna provides a valuable data-mining algorithm for the following vertical markets:

• Telecommunications
• Health Care
• Banking and Finance
• Insurance
• Re-Insurance
• Manufacturing
• Retail
• Valuers (Public/Private)
• Building Societies
• Banks
• Financial Institutions
• Estate Agents
• Public Sector
• Academia


Genna System Requirements



Hardware: Pentium III 550MHz with 256 MB RAM (512 MB recommended). A CD-ROM drive is also required for installation.

Operating system: Windows 2000 (Service Pack 3) or Windows XP (Service Pack 1), and Microsoft Office 2000
