Addressing the Problem of Feature Selection Using Genetic Algorithms

Posted by Munshi Imran Hossain

Jan 15, 2018 8:12:00 AM

The problem of feature selection

The explosion in the availability of big data has made complex prediction models a conspicuous reality of our times. Whether in banking, financial services and insurance, telecoms, manufacturing or healthcare, predictive models are increasingly used to draw inferences from data.

Most of these models use a set of input variables, called features, to predict the value of a variable of interest. For example, the concentration of characteristic biomarkers in the blood can be used to predict the presence, absence or progress of certain diseases.

The available data can provide a large number of features, but it is generally preferable to build a model on a small number of truly relevant ones. A model with more features is more complex, which increases the computational resources and time needed to train it. Choosing the subset of features that yields a model with optimum performance is the problem of feature selection. It is essentially a problem of plenty.
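To make the idea concrete, here is a minimal sketch of how a genetic algorithm might search for a feature subset. It is not the method from the full post, which this excerpt does not show. Each candidate subset is a list of 0/1 flags, one per feature. The fitness function is a deliberately simple stand-in (the sum of absolute correlations with the target, minus a per-feature penalty) rather than the performance of a trained model, and all names and parameter values (`make_fitness`, `select_features`, `penalty`, `mutation_rate` and so on) are hypothetical.

```python
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5 if vx > 0 and vy > 0 else 0.0


def make_fitness(columns, target, penalty=0.2):
    """Build a fitness function over 0/1 masks (assumed stand-in for model performance).

    A mask scores the summed |correlation| of its selected features with the
    target, minus `penalty` per selected feature to discourage large subsets.
    """
    relevance = [abs(pearson(col, target)) for col in columns]

    def fitness(mask):
        gain = sum(r for r, bit in zip(relevance, mask) if bit)
        return gain - penalty * sum(mask)

    return fitness


def tournament(population, fitness, rng):
    """Pick two distinct individuals at random and return the fitter one."""
    a, b = rng.sample(population, 2)
    return a if fitness(a) >= fitness(b) else b


def select_features(fitness, n_features, pop_size=30, generations=40,
                    mutation_rate=0.05, seed=0):
    """Evolve 0/1 feature masks and return the best one found (needs n_features >= 2)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        next_gen = [best[:]]  # elitism: carry the current best forward unchanged
        while len(next_gen) < pop_size:
            p1 = tournament(population, fitness, rng)
            p2 = tournament(population, fitness, rng)
            cut = rng.randint(1, n_features - 1)  # single-point crossover
            child = p1[:cut] + p2[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]
            next_gen.append(child)
        population = next_gen
        best = max(population, key=fitness)
    return best


if __name__ == "__main__":
    # Synthetic demo data: only features 0 and 3 influence the target.
    data_rng = random.Random(42)
    cols = [[data_rng.gauss(0, 1) for _ in range(500)] for _ in range(8)]
    y = [2 * cols[0][i] - 3 * cols[3][i] + data_rng.gauss(0, 1)
         for i in range(500)]
    mask = select_features(make_fitness(cols, y), n_features=8)
    print("selected feature indices:", [i for i, bit in enumerate(mask) if bit])
```

In practice the fitness function would more likely be a cross-validated score of the actual predictive model, which is far more expensive to evaluate; the penalty term is one simple way to express the preference for fewer features.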


Topics: big data, data science, genetic algorithm


Round-Up: The 6 Hottest Blog Topics from 2017

Posted by Cytel

Dec 21, 2017 6:44:00 AM

As we prepare to close the door on 2017, we thought we would take a look back at the topics which have been most popular on the Cytel blog this year. It's an interesting insight into which pain points and opportunities feature highly on our global biopharma audience's radar. Read on to learn which of our 2017 blogs have received the most interest from our audience so far.


Topics: Oncology, Data Management, Cytel Strategic Consulting, Statistical Analysis, Trial Design Software, pharmacometrics, estimands, NONMEM, data science, R language


Signal Management Using R

Posted by Krishna Asvalayan

Dec 12, 2017 7:33:00 AM


Signal management is one of the most audited pharmacovigilance processes. It also generates among the highest numbers of audit findings. Marketing Authorisation Holders (MAHs) sometimes fall short of expectations when it comes to building a robust signal management system that is fully audit- and inspection-ready. Happily, technology can be used to make the process more scientific and rigorous.
Technology in the signal management process can be divided into two categories. The first is the front end, i.e. the platform (.NET/Java) used to develop the system. The second is the back end, i.e. the programs and software (R, Python, SAS) used to process the data. In this blog, we will focus on the second category and discuss how R specifically can help improve the signal management process.


Topics: Statistical Analysis, data science, pharmacovigilance, R language, signal detection


Removing 'Noise' from Biomedical Signals

Posted by Munshi Imran Hossain

Sep 7, 2017 7:00:24 AM

By Munshi Imran Hossain, Software Affiliate at Cytel

Biomedical signals are electrical signals collected from the body. Some of the most common ones are the electrocardiogram (ECG) and the electroencephalogram (EEG). These signals are of great value because they can be used for diagnostic purposes. Importantly, most of them can be collected using non-invasive methods. These attributes, together with the tremendous recent advances in electronic and digital processing technology, have made biomedical signals an important source of data for medical diagnostics.


Topics: signal processing, data science, biomedical signals, Fourier

