The Cytel blog keeps you up to speed with the latest developments in biostatistics and clinical biometrics.

Career Perspectives: Interview with Munshi Imran Hossain, Senior Data Scientist

October 23, 2018

Cytel data scientists apply advanced statistical techniques, including predictive modeling of biological processes and drug interactions, to unlock the potential of big data.

In this blog we talk to Munshi Imran, who is based in Pune, India, to find out more about his career path, his current role at Cytel and his interests outside of work.


Career Perspectives: Interview with Andrea Hita, Biomedical Data Scientist

June 7, 2018

Cytel data scientists apply advanced statistical techniques, including predictive modelling of biological processes and drug interactions, to unlock the potential of big data.

In this blog from our Career Perspectives series, we talk to Andrea Hita, a Data Scientist at Cytel, to find out more about her career path, her current role at Cytel and her interests outside of work.


Addressing the Problem of Feature Selection Using Genetic Algorithms

January 15, 2018

The problem of feature selection

The explosion in the availability of big data has made complex prediction models a conspicuous reality of our times. Whether in banking, financial services and insurance, telecoms, manufacturing or healthcare, predictive models are increasingly used to draw inferences from data.

Most of these models use a set of input variables, called features, to predict the value of a variable of interest. For example, the concentration of characteristic biomarkers in the blood can be used to predict the presence, absence or progress of certain diseases.

The available data can provide a large number of features, but it is generally preferable to use a small number of highly relevant features in a model. A model with more features is more complex, which increases the computational resources and time needed to train it. It is therefore desirable to restrict the number of features in a predictive model. Choosing the subset of features that results in a model with optimum performance is the problem of feature selection. This is essentially a problem of plenty.
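To make the idea concrete, here is a minimal sketch of how a genetic algorithm could search for a feature subset. It is an illustration under stated assumptions, not the method described in the full post: the 12 candidate features, the `RELEVANT` set and the toy `fitness` function are all hypothetical stand-ins. In practice the fitness would more likely be something like a cross-validated model score computed on the selected features.

```python
import random

# Hypothetical toy setup: 12 candidate features, of which only these matter.
N_FEATURES = 12
RELEVANT = {1, 4, 7}

def fitness(mask):
    """Toy stand-in for model performance: reward each relevant feature
    selected and charge a small cost for every feature included."""
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in RELEVANT)
    return hits - 0.1 * sum(mask)

def tournament(population, rng, k=3):
    """Return the fittest of k randomly chosen individuals."""
    return max(rng.sample(population, k), key=fitness)

def crossover(a, b, rng):
    """One-point crossover of two parent masks."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]

def mutate(mask, rng, rate=0.05):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in mask]

def select_features(pop_size=30, generations=40, seed=0):
    """Evolve 0/1 feature masks; return the best mask and the
    best score recorded at each generation."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(N_FEATURES)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        best = max(population, key=fitness)
        history.append(fitness(best))
        # Elitism: carry the current best forward unchanged.
        next_gen = [best]
        while len(next_gen) < pop_size:
            child = crossover(tournament(population, rng),
                              tournament(population, rng), rng)
            next_gen.append(mutate(child, rng))
        population = next_gen
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history

if __name__ == "__main__":
    best_mask, scores = select_features()
    chosen = [i for i, bit in enumerate(best_mask) if bit]
    print("selected features:", chosen, "score:", fitness(best_mask))
```

Each candidate solution is a list of 0/1 flags, one per feature. The per-feature cost in `fitness` encodes the preference for small models, and keeping the best individual in every generation means the best score never gets worse from one generation to the next.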
