Precision FDA — Free super computing power preloaded with applications!

Okay — I now have a PrecisionFDA account, I’ve just uploaded data to it, and I’m trying to reduce some spectra with RIDAR on it — and I have no idea why this is here, but I like it.

Disclaimer: Since this is an HHS US government thing it might be for US people only? But…I’m in Japan right now and I logged right in, even with the two-factor authentication thing.

You can go to PrecisionFDA here.

What do I know about it?

Well…they are the ones responsible for the CPTAC challenge to identify mislabeled samples — so they’re clearly the good guys. That was cool, even if the start and end dates of the challenge made it clear they didn’t expect anyone interested in participating to have a job. Scientists tend to be busy people, yo. If you want them to volunteer for stuff and they see they have to start and complete it in like three weeks, no one is going to take you up on it.

What else do I know about them? I just got a free account and uploaded data to their cluster. There are a bunch of tools there already, but it currently looks like all dumb genome and transcriptome stuff — still, if someone is going to let me run my tools on their power bill, they’re cool people in my book.


A wild TMTPro Paper has appeared!!

About darned time!

Just accepted at JPR — the first (as far as I’m aware — please correct me if I’m wrong) study showing the use of TMTPro (previously TMT 16-plex)!

Quick summary of my rapid readthrough:

1) This group typically uses an NCE of 38 for TMT10/11-plex reagents; they use 32 for TMTPro. (Please keep in mind that the proper HCD NCE can vary from system to system, and there are ways to calibrate for that now.) The important part is that the HCD is lower — closer to what we use for unlabeled peptides! This is particularly good for those of us still using MS2 for TMT. The authors describe the use of both MS2 and SPS MS3 on their instrument. (And in my hands an HCD of 32 on an Orbitrap Fusion lines up pretty closely with an NCE of 27 on a Q Exactive — again, it varies from instrument to instrument, but this all sounds right to me!)

2) The larger tag makes the peptides a bit more hydrophobic (they elute later), but it is a shift of a few minutes that can easily be adjusted for.

3) When comparing peptide/protein ID numbers directly, TMTPro results in a few percent fewer identifications, but you get five extra samples done simultaneously, so I still call that a win.

Concise, well-executed little study that will deserve the thousands of citations it will get for being the first one to press.

And — for those of us dying to get our hands on a TMTPro dataset — all these files have been deposited! (PRIDE PXD014750)  I’m filling out the form on PRIDE now to have the files released for public download. 

ThermoRawFileParser — A big little step away from Windows!

A long time ago I was in a relatively serious car accident. My recovery cost me two weeks of classes and I learned that concussions are seriously no fun at all. However, if you gave me an option of going through that again or migrating all my computers to Windows 10….I’d need time to think about it.

Unfortunately, like all of us, I have no choice at all. Windows 7 support is ending and our field is intrinsically tied to this Cortana- and Bing-infused catastrophe. Or is it? What is still missing?

Sure — the instruments need to run on a corporate operating system, but there are increasing numbers of options for the data processing that don’t involve someone running an ad to try and sell you stuff while looking for your stuff on your hard drive.

(If you do run into an ad inside your computer, this tutorial will help. This appears to be disabled in Enterprise versions, but who knows for how long? I removed Cortana from the SysReg manually, and on the next update, there she is, helpfully taking me to a place to purchase Kanye’s new album every time I type the exact name of an Excel spreadsheet into the search bar.)

“..thanks Bing! You’re the best!”

I should sleep more. This is getting out of hand.

Certain bioinformaticians in our field have been leading the charge away from Windows for quite some time and my obsession with learning how to follow them is filling the pages of this increasingly strange blog these days. And ThermoRawFileParser couldn’t have come at a better time!

I’m working on installing ProteoWizard on our cluster now — as far as I can tell there is still considerable extra functionality in it, so I should definitely get both up there — but this new tool has some really cool advantages as well, including the direct production of JSON metadata files. And, in a head-to-head with msConvert, the new tool appears to produce mzML files more accurately, as they result in more total peptide IDs!
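If you want to try the conversion yourself, here is a minimal sketch of how I would assemble a ThermoRawFileParser invocation from Python. The flag spellings (`-i`, `-o`, `-f`, `-m`) are from the project’s README as I remember them — double-check against the help output of the version you actually install; the file names are made up.

```python
from pathlib import Path

def thermo_raw_file_parser_cmd(raw_file, out_dir, exe="ThermoRawFileParser.exe"):
    """Build a ThermoRawFileParser command line (flag names assumed from
    the project's README -- verify against your installed version).

    -f=2 asks for indexed mzML; -m=0 writes the JSON metadata file.
    """
    return [
        "mono", exe,          # the mono prefix is only needed on Linux/macOS
        f"-i={raw_file}",     # input .raw file
        f"-o={out_dir}",      # output directory
        "-f=2",               # output format: indexed mzML
        "-m=0",               # metadata format: JSON
    ]

cmd = thermo_raw_file_parser_cmd(Path("hela_200ng.raw"), Path("converted"))
# once the tool is installed, hand this to subprocess.run(cmd, check=True)
```

Building the argument list first (instead of a single shell string) makes it easy to fan out over hundreds of .raw files on a cluster scheduler.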

Re-Identifiability of Proteomic Data and its Implications….

Ummm….okay…so this is open access and it addresses one of the biggest (and scariest) elephants in the room. I hate to keep drawing attention to it, but with 40+ peer-reviewed studies on forensic proteomics in 2019 already, we need to start talking about this.

Anyone in the world can go to ProteomeXchange and download data from one of the repository partners like PRIDE or MassIVE. If there is personally identifiable information in there, do we need to be thinking about this? Albert Heck, do we need to start having this conversation with the general scientific community and/or…yikes…government regulators…?

This thoughtful paper addresses these questions and (IMAHMFO) properly describes them as “dilemmas”.

With genetics we have to be extremely cautious about how the data is anonymized — there are explicit disclosure agreements and fancy government forms for the release of genetic data, with descriptions of the potential consequences. I think I’ve been told that there are people at Hopkins who do this stuff as a job, informing patients of their rights when they’re participating in big genetics studies.

If you can track single amino acid variants specific to people in something as benign as hair, it doesn’t seem all that hard to imagine that you could identify a person — and things about them — from a plasma proteome, right? Maybe y’all on the biology side are already doing this stuff and I should just get out of the noisy room more? I hope so!

Unnatural selection — 100% Recommended Documentary!

If you need to catch up on a ton of those genetics terms and techniques you’ve heard people mumbling about, there might not be a better or more interesting way than this new documentary.

CRISPR stuff?  Check! 
GeneDrive stuff? Check!
Some…interesting….looking “Biohacker” guys saying reasonably accurate science things and then injecting themselves with stuff?

What to do with 100,000 core hours of super computer access!??!

I think I just successfully convinced @SpecInformatics to throw in on a study where we try to do ALL THE PROTEOMICS THINGS on High Performance Computing.

I just got 100,000 core hours for free, and I was told that if I could come up with a valid excuse I could probably have another 300,000 hours to use in the next 365 days.

Lesson 1)

CompOmics FTW! The amazing people at the UVA HPC were easily able to set up an Anaconda module and — BOOM — SearchGUI.

Interesting thing I forgot two Linux boxes ago — or honestly didn’t know — while SearchGUI installs your ten search engines with the Windows install package, they might not automatically install in the Linux versions. Okay — but this can be a huge advantage.

Wait — you know about SearchGUI. I ramble about it all the time. Okay — if you don’t — SearchGUI is this amazing idea from a bunch of smart Belgian(?) people who said, “wow, there are a lot of amazing search engines out there for free, but most of them are a pain in the arse to set up and use, so people using one aren’t going to have the energy to set up the others. Can we fix this? Oh…and choosing just one is dumb…let’s fix that too!” And you get ten engines you’ve heard of in a super easy interface!
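On a cluster you drive SearchGUI through its SearchCLI class rather than the interface. Here is a sketch of how I would build that command line from Python — the argument names are from the SearchGUI wiki as I remember them, so verify them against the help text of your jar (older 3.x versions, for example, took the FASTA from the parameters file instead of a flag); all file names are invented.

```python
def searchcli_cmd(jar, mgf, fasta, params, out_dir,
                  engines=("comet", "xtandem", "msgf")):
    """Sketch of a SearchGUI SearchCLI invocation (flag names assumed from
    the SearchGUI wiki -- confirm with `java -cp <jar>
    eu.isas.searchgui.cmd.SearchCLI` on your install).
    """
    cmd = [
        "java", "-cp", jar, "eu.isas.searchgui.cmd.SearchCLI",
        "-spectrum_files", mgf,      # input spectra (MGF)
        "-fasta_file", fasta,        # sequence database
        "-id_params", params,        # .par identification parameters
        "-output_folder", out_dir,
    ]
    # each engine is toggled with a -<engine> 1/0 flag, which is what makes
    # it painless to fan one MGF out to several engines at once
    for engine in engines:
        cmd += [f"-{engine}", "1"]
    return cmd

cmd = searchcli_cmd("SearchGUI.jar", "run.mgf", "human.fasta",
                    "search.par", "out")
```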

(I was only running with decoy search off because I was trying to troubleshoot something odd.)

It’s an amazing bit of convenience and power that you can get here. I can’t recommend it enough. I even started making tutorial videos for it a couple of years ago and then forgot about it completely. Maybe I’ll finish them later! My calendar says there is some free time coming up in August of 2024.

Can you imagine how much work it is for this group to keep up with the improvements to each of these engines? They do a great job — the awesome Comet engine alone has had at least two updates since ASMS 2019, which I’m convinced was yesterday.

I don’t know how to do it yet, but it looks like I can just get going with the newest version! Success!!

Right now I’ve just got Novor and DirecTag going — because if you’ve got 100,000 computational core hours and you don’t go after de novo first you probably don’t need it. I always need de novo! 

How long does this HPC need to run a Novor + DirecTag search on a human HeLa MGF file from 200 ng on a QE HF? (I’ve got ProteoWizard; I’ve just got to get it set up properly so it will accept .RAW and .d.)

About 60 seconds for both. Interestingly, at 3 AM it is about 40% faster than at 1 PM….

If you’ve got an HPC on your campus — go talk to the nice people that run it — and see if it can be an asset for you!  My next plan — MAXQUANT — because —

MaxQuant isn’t just for Windows anymore!!!

Great GalaxyP Tutorials hosted at GalaxyProject.eu

Have you seen the great new application study where GalaxyP was used and thought…okay….

The arguments are building up for why you need this.

Proteogenomics?
TAILS?
MetaProteomics?
Secretomics?

If you’re also thinking “…wait…remind me what Galaxy is again…? I know I saw a talk from that really cool guy from Minnesota (Pratik)”

Galaxy is a flexible interface for linking all sorts of tools on super computer thingies. GalaxyP is the proteomics version. You can have someone smart build you a GalaxyP instance on your supercomputer thing — but there is a cooler way of doing this — you can just borrow time on someone else’s!

GalaxyProject.EU has workflows built in that you can use AND they have loads of tutorial stuff so you aren’t starting alone on that terrifying project.

You can directly access all this stuff here.

Challenges and Opportunities for Mass Spec cores in the Developing World!

This article isn’t brand new, but I just stumbled across it and really appreciated the perspective on it. It’s open and available here.

1) How do you get funding to set up and run a core outside of where most of them are?
2) What challenges would you face if you packed up and decided to go there? Yo…the 24 hours to pump down your Orbitrap after every brownout…that sounds like a blast, right?
3) And this is the absolute best part of the article — the Opportunities!  — yes, there is all sorts of great basic science that you can do with baker’s yeast. But — there are diseases the World Health Organization reference lists as serious people killers that I’ve never heard of, and I bet that almost no proteomics or metabolomics has ever been done on.  There is such an opportunity to do good and have an impact that we can’t possibly ignore the development of biological mass spec in the developing world.

Yeah, you could argue that you could just send more samples here, but have you gotten human samples from Africa before? I have, and I wish I had known about this new technique that helps you tell how many freeze/thaw cycles your samples have been through! When your samples are traveling thousands of miles there is a very good chance that some valuable data will be lost, particularly for molecules that are not as structurally robust.

ProtRank — Go beyond protein value imputation!

How we deal with “missing values” may always be controversial, and I’m going to assume that no level of improvement in mass spectrometry engineering is going to fix this. Sure, we can get better coverage, but sometimes that peptide just isn’t going to be there — maybe because it’s got a single amino acid variant (SAAV), or maybe because it’s got a post-translational modification in patient/condition A that is not present at all in B.

At some level, though, we’ve got a tough decision to make. Do you reeeeeeaaaallllly want to divide by zero? Or do you want to throw out that whole peptide measurement in your downstream analysis pipeline? It often makes sense to impute a value for that peptide or molecule that you can’t see in your extracted chromatogram.
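To make the divide-by-zero point concrete, here is a toy sketch with invented numbers. It illustrates the bias that a “just impute a small constant” fix introduces — it is not ProtRank’s actual algorithm, which works with rank changes instead of ratios.

```python
import math

# Made-up intensities: a peptide at 5e5 in condition A that is simply
# not detected (0) in condition B -- log2(B/A) would divide by zero.
peptide_A, peptide_B = 5.0e5, 0.0

def log2_fold_change(a, b, floor):
    """log2(B/A) after flooring both values: the usual 'impute a small
    constant' workaround for missing intensities."""
    return math.log2(max(b, floor) / max(a, floor))

# The fold change now depends entirely on the arbitrary floor you picked,
# which is exactly the kind of imputation bias a rank-based approach
# like ProtRank tries to sidestep.
fc_small_floor = log2_fold_change(peptide_A, peptide_B, floor=1e3)  # ~ -9.0
fc_big_floor = log2_fold_change(peptide_A, peptide_B, floor=1e5)    # ~ -2.3
```

Same peptide, same data — a roughly 500-fold versus a roughly 5-fold apparent change, purely from the imputation choice.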

ProtRank may not be the ultimate solution (…cause…realistically there may not be one universal solution…), but it’s a different take on this old problem. You can read about it in this new open article.

ProtRank is written in Python and is available on GitHub here.

This study is interesting in its examination of some extreme dataset models and the biases that typical imputation methods cause in them. One place where it is really scary to impute is phosphoproteomics. A lot of phosphorylation sites change to such an extent that they exceed the linear dynamic range of the instruments (I don’t fall into the school of thought that there are truly 100% on/off switches — I think there are different bi-stability cliffs — I almost threw in some references here, but I really should go to work). Do you impute here?

Want to talk about a nightmare dataset? They look at phosphoproteomic shifts in IRRADIATED CELLS. DNA damage repair functions through phosphorylating everything it can to stop processes that make the radiation damage worse. The increases in phosphorylation are probably as big as you can get. Imputing some values shifts the data to the point that you lose a lot of the known phosphorylation changes. Whoops.

How much better does ProtRank do? In part we have to wait and see. It is applied in a big biological study that is in preparation. This paper is the introduction and logic behind the code, and a nice way of saying “download me!” So…
