Hi all,

I've started analyzing my RNA-Seq data for two time points, Day0 and Day4, each with control and treated samples. I've aligned the reads to the reference genome using TopHat and removed duplicates from the data sets.

Could somebody please tell me how important it is to remove duplicates, and how it will influence my results if I don't remove them?

I want to run Cufflinks all the way through to Cuffdiff. Where do I start, since there are so many options in the manual to choose from? What should I look for?

Kind regards,
Lizex