baoilleach/7thRDKitUGM.txt

## 7thRDKitUGM.txt
CzodrowskiPaul Very comprehensive summary of the #RDKitUGM2018! Kudos to Pat! https://t.co/dhwZhjFu0u

dr_greg_landrum After a really good #RDkitUGM2018 it was great to get out into the mountains today and concentrate on moving instead of chemInformatics. ;-) https://t.co/QoDAp3g3Qp

CzodrowskiPaul @wpwalters @dr_greg_landrum @AndreasBenderUK And kudos to all of you who came over from North, South America and even Japan! #RDKitUGM2018

dr_greg_landrum @AndreasBenderUK Thanks for hosting Andreas. Cambridge was a great place to have the meeting and the social events were exemplary! #RDKitUGM2018
-->  wpwalters @dr_greg_landrum @AndreasBenderUK Loads of fun and lots of great science.  Thanks to the organizers and all who participated.

AndreasBenderUK Thanks everyone for coming to Cambridge over the last few days - it was a pleasure having you here! Have a good trip home and until soon, Andreas #rdkitugm2018

iwatobipen I really enjoyed the mtg. I am grateful to everyone! :-) #RDKitUGM2018
--> CzodrowskiPaul @iwatobipen Have a safe trip home, it's a long journey for you!
    --> iwatobipen @CzodrowskiPaul Thank you! See you again!

nathanbroon @SantaFeDoshin wanted to say I loved your elegant abuse of SMILES at the #RDKitUGM2018

wojcikowskim Had very interesting and enlightening talks over the last few days at #RDKitUGM2018 Thanks for all the presentations and beers afterwards! Looking forward to seeing all of you at the next UGM!
--> CzodrowskiPaul @wojcikowskim there was still too little time to talk to all RDKitters
    --> wojcikowskim @CzodrowskiPaul Do you think we could get enough?
    --> CzodrowskiPaul @wojcikowskim I think there is never ever enough time, even if the UGM lasts over one week..
    --> dr_greg_landrum @CzodrowskiPaul @wojcikowskim Agreed. And I’m definitely not going to moderate all the talks next time. That really got in the way of me being able to talk to folks.
        --> CzodrowskiPaul @dr_greg_landrum @wojcikowskim If it comes to moderation: I can help you, even at the cost of fewer tweets

 RSC_CICAG That’s the close of the #RDKitUGM2018 for another year. See you all next year. Location TBD!
-->      rguha @RSC_CICAG A US location next time?
    -->  RSC_CICAG @rguha Are you volunteering? 😏
    --> dr_greg_landrum @rguha @RSC_CICAG I'd love to do a US meeting, perhaps as a complement to (not replacement for) the European one.
It does require a local organizer though.
        --> baoilleach @dr_greg_landrum @rguha @RSC_CICAG Could be called the USGM vs the EUGM.
        -->      rguha @RSC_CICAG I could certainly help, though maybe a more serious RDKit’er than me should lead

CzodrowskiPaul #RDKitUGM2018 I'm heading out, enjoy the rest of the meeting!

CzodrowskiPaul #RDKitUGM2018 Another rocket from Roger Sayle's fireworks "The only reason why QSAR has not worked over all the years is that you have simply been calculating molecular weight in the wrong way"
-->    pwk2013 @CzodrowskiPaul @marwinsegler The problem may be more connected with the number of parameters exceeding the number of IC50 measurements...
-->      rguha @CzodrowskiPaul Did he say anything about the right way?
    --> baoilleach @rguha @CzodrowskiPaul Oh boy, did he ever. There was the bit about the relativistic corrections for the rest mass of an electron, you know, for charged species, and then the bit about the application of the binomial theorem to work out the isotopolog. Slides will be available early next week I expect.
    --> marwinsegler @pwk2013 @CzodrowskiPaul Certainly not. As Roger’s @nmsoftware talk clearly highlighted, this can be to a large degree attributed to an inappropriate treatment of the electron rest mass when calculating molecular weight
        -->    pwk2013 @marwinsegler @CzodrowskiPaul @nmsoftware This is excellent news and surely the resulting #LigandEfficiency #metrics will disrupt #DrugDiscovery for decades to come
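
(ed: The binomial theorem remark in the thread above can be illustrated with a small sketch. It is not taken from Roger's slides. It assumes an element with exactly two isotopes, here chlorine, and uses approximate natural abundances that are illustrative values only. The function name `chlorine_isotopolog_distribution` is hypothetical.)

```python
from math import comb

# Approximate natural abundances of the two chlorine isotopes.
# These are illustrative assumptions, not values quoted in the talk.
P_CL35 = 0.7576
P_CL37 = 0.2424


def chlorine_isotopolog_distribution(n_cl):
    """Return P(k atoms of 37Cl) for k = 0..n_cl.

    Each term is one term of the binomial expansion of
    (P_CL35 + P_CL37) ** n_cl.
    """
    return [
        comb(n_cl, k) * P_CL37**k * P_CL35 ** (n_cl - k)
        for k in range(n_cl + 1)
    ]


# Relative populations of the isotopologs of a molecule with two Cl atoms.
dist = chlorine_isotopolog_distribution(2)
for k, p in enumerate(dist):
    print(f"{k} x 37Cl: {p:.4f}")
```

Because the terms come from a binomial expansion, they sum to 1 as long as the two abundances do. An average molecular weight would then be the population-weighted mean over these isotopologs.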

CzodrowskiPaul #RDKitUGM2018 One rocket from Roger Sayle's fireworks "RDKit is even better than expected" (when it comes to the calculation of molecular weight)

janhjensen Based on the tweets I really wish I had gone to #RDKitUGM2018
-->  phisch124 @janhjensen I thought the same. I hope I'll be able to join next year.

CzodrowskiPaul #RDKitUGM2018 Roger Sayle quotes Andrew Grant "The greatest contributions of cheminformatics to drug discovery have been molecular weight and logP"

 RSC_CICAG The tasks Roger will be tackling today… #RDKitUGM2018 https://t.co/AnwsmaXo37

baoilleach @nmsoftware #RDKitUGM2018 1. Calculating mol wt 2. Counting lines in a text file 3. How to calculate a %
--> baoilleach @nmsoftware @janhjensen And you could have met the other Jan Jensen. Though he didn't make it either this year.
    --> janhjensen @baoilleach @nmsoftware It’s almost like I missed it twice 🙂

     jwmay Fun fact: the foundation of https://t.co/6ReCchzqLn was started when I was at EBI, so of course PDBe have never heard of it :-) #RDKitUGM2018. Can't comment on how good the @the_cdk is at reading mmCIF, either with its parser or JUMBO.

 RSC_CICAG Now for the last talk of #RDKitUGM2018, and the now-traditional ‘Roasting of Greg and the RDKit’ by Roger Sayle (@nmsoftware) https://t.co/hjsA2ar8dh

baoilleach #RDKitUGM2018 Roger Sayle from @nmsoftware on Deceptively simple: How some cheminf problems can be more complicated than they appear
-->   Piman314 @baoilleach @nmsoftware So sad that I missed Roger's talk, it's always a highlight. "you either die a hero or live long enough for Roger to absolutely slam your work in a talk"
    --> markussitzmann @Piman314 @baoilleach @nmsoftware He seems to have some fondness regarding #RDKit for doing exactly this 🙂

baoilleach #RDKitUGM2018 Oh oh Roger's up.

baoilleach #RDKitUGM2018 Shout out to Oliver Smart from Lukas Pravda for starting the move to #rdkit and tidying up the pipeline.

baoilleach #RDKitUGM2018 Ref to custom templates for images of ruthenium complexes in PDBeChem. (ed:Yay for ruthenium complexes!! https://t.co/As4bT3HkCN)

CzodrowskiPaul #RDKitUGM2018 https://t.co/sjA54drkot

iwatobipen PDBe RESTFUL API and PDBeChem
https://t.co/LXXbUFcUfl…

https://t.co/zxy0kTBRI6…
 #RDKitUGM2018

 RSC_CICAG Penultimate talk of #RDKitUGM2018 from Lukas Pravda on the use of the #RDKit in @PDBeurope. @RDKit_org https://t.co/IpvDfjV5Lr

baoilleach #RDKitUGM2018 Excellent presentation by Andrea Morger on Conformal prediction for toxicity read-across. (ed: I need to study it up.)

 RSC_CICAG Last session of the UGM starts with Andrea Morger (Charité Berlin) talking on machine learning for toxicology prediction. #RDKitUGM2018 https://t.co/ZWMuzdKweh

nathanbroon @baoilleach That’s super cool, Noel. Will take a read of your preprint later. #RDKitUGM2018
--> baoilleach @nathanbroon Yeah, later. No but that's okay. I'm cool with that.
    --> nathanbroon @baoilleach Just read it and pleased you can roundtrip without error. Just wondering if you’ve thought about the canonicalisation of the parent molecule leading to non-canonical substructures? Could you represent the SMILES with layers of canonicalised fragment SMILES in some way?
        --> baoilleach @nathanbroon They call it a reduced graph. You should visit Sheffield some time.
            --> nathanbroon @baoilleach I call them feature trees…
            --> nathanbroon @baoilleach I mean having the substructure SMILES still explicitly defined in the SMILES.
            --> baoilleach @nathanbroon But srsly, a reduced graph does essentially that. I cite a paper where they've done that and it's worth pursuing. But in general, it's an interesting idea to do this for SMILES, e.g. change the traversal order so that certain substructures are always the same
                --> nathanbroon @baoilleach But the traversal order to canonicalise one substructure will necessarily change the canonical form of others as you navigate the structure, no?
                --> nathanbroon @baoilleach I have a painful representation in my head but it’s ugly. Need to think about this more.
                    --> baoilleach @nathanbroon Well, you can do it for terminal groups at least.
                        --> nathanbroon @baoilleach For sure

 RSC_CICAG Noel: the forgotten talks… #RDKitUGM2018 https://t.co/ApefFW1ihL
--> baoilleach @RSC_CICAG Sounds like a fantasy epic .

baoilleach #RDKitUGM2018 +1 from me for the change from elitist selection to tournament/roulette/ranked. The whole point of a GA is to avoid local minima, and elitist selection heads straight for it.
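
(ed: A minimal sketch of the difference between elitist and tournament selection, to make the point above concrete. It is not AutoGrow's code. The function names, the toy population and the convention that higher fitness is better are all assumptions for illustration.)

```python
import random


def elitist_select(population, fitness, n):
    """Keep only the n fittest individuals (higher fitness is better)."""
    return sorted(population, key=fitness, reverse=True)[:n]


def tournament_select(population, fitness, n, k=3, rng=random):
    """Pick n parents; each one is the fittest of k randomly drawn individuals.

    Weaker individuals can still win a tournament if their k-1 rivals are
    weaker still, which keeps more diversity than pure elitism.
    """
    return [max(rng.sample(population, k), key=fitness) for _ in range(n)]


# Toy population: the individuals are integers and fitness is the value itself.
population = list(range(20))
fitness = lambda x: x

elite = elitist_select(population, fitness, 5)
parents = tournament_select(population, fitness, 5, k=3, rng=random.Random(0))
print("elitist:   ", elite)
print("tournament:", parents)
```

Elitist selection always returns the same top five, whereas the tournament result depends on the random draws, so mid-ranked individuals get a chance to reproduce.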

baoilleach #RDKitUGM2018 A reference to Ghose filters as well as Lipinski.

A key theme is that the code is more extensible - can now add your own filters, reactions, etc.

iwatobipen New version of AutoGrow
- new version does ligand handling with 1D SMILES
- All calculations for mutation, crossover, and filtering are now done in rdkit
 #RDKitUGM2018
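
(ed: One way an extensible filtering step could look, as a sketch only. This is not AutoGrow's API. The registry `FILTERS`, the decorator `register_filter`, the descriptor keys and the two example molecules with their numbers are all hypothetical. In practice the descriptors would be computed with a toolkit such as RDKit. The Lipinski thresholds are applied strictly here, with no allowance for a single violation.)

```python
# Hypothetical registry mapping a filter name to a predicate on descriptors.
FILTERS = {}


def register_filter(name):
    """Decorator that adds a predicate to the registry under the given name."""
    def decorator(func):
        FILTERS[name] = func
        return func
    return decorator


@register_filter("lipinski")
def lipinski(d):
    # Rule-of-five thresholds, applied strictly.
    return d["mw"] <= 500 and d["logp"] <= 5 and d["hbd"] <= 5 and d["hba"] <= 10


@register_filter("max_rotatable_bonds")
def max_rotatable_bonds(d):
    # A user-defined extra filter; the cutoff of 10 is an arbitrary example.
    return d["rotb"] <= 10


def passes(descriptors, names):
    """True if the molecule passes every named filter."""
    return all(FILTERS[name](descriptors) for name in names)


# Made-up descriptor values for two hypothetical molecules.
small = {"mw": 320.0, "logp": 2.1, "hbd": 2, "hba": 5, "rotb": 4}
large = {"mw": 710.0, "logp": 6.3, "hbd": 4, "hba": 9, "rotb": 14}

print(passes(small, ["lipinski", "max_rotatable_bonds"]))
print(passes(large, ["lipinski"]))
```

Adding a new filter then only needs one decorated function, without touching the code that runs the filters.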

 RSC_CICAG Recent publication on AutoGrow: https://t.co/8ywijrmziK #RDKitUGM2018

CzodrowskiPaul Jacob Spiegel from Durrant lab (https://t.co/xsaaWZWTle) talks about AutoGrow https://t.co/CB6m5qSvgg #RDKitUGM2018

 RSC_CICAG Next up is Jacob Spiegel from Pittsburgh talking about more de novo design. This time with genetic algorithms. #RDKitUGM2018 https://t.co/l4pNQuRsTf

baoilleach #RDKitUGM2018 Jacob Spiegel from Durrant Lab (Uni of Pittsburgh) describing AutoGrow 4.0. Durrant Lab Mission: developing sci comp tools that are free, open-source and accessible to non-programmers.
--> baoilleach Has a module for conversion from SMILES to 3D (Gypsum), and also separately a protonation module. These will be released as open source in the next while.
    --> CzodrowskiPaul @baoilleach you are always a few seconds quicker than me 🙂

iwatobipen unsolved challenges,
no condition prediction,
no e.r./d.r./yield prediction, natural products do not work yet,
And one more(missed...)
 #RDKitUGM2018

baoilleach #RDKitUGM2018 How to test the quality? Really quite challenging. Chemical Turing Test. Showed routes to chemists in a double blind way and asked the chemists which they preferred. The difference is not statistically significant (ed: but that's just a feature of the dataset size)
--> baoilleach I'll tweet my question: it's a different problem, but what do you think of relaxing the constraint on the start state? That is, if a very similar molecule could be made easily (e.g. short route), show that route instead.
    --> CzodrowskiPaul @baoilleach Nice question, you should have asked it - or maybe I was just quicker with my question.. (if this is the case: apologies!)
        --> baoilleach @CzodrowskiPaul It just was that I didn't get a chance. Here's another Q for @marwinsegler: you said that 3TB of data after three steps going forward. So you should totally do that, and use it as a lookup when doing the retrosyn. If you're interested in fast disk-based lookup of TBs, that's us.
            --> baoilleach @CzodrowskiPaul @marwinsegler Actually, that's not a question is it? Oh well.
            --> marwinsegler @baoilleach @CzodrowskiPaul This is just an estimation based on b^d (b: branching factor, d: depth) Since I don’t have even a 1TB disk, I never actually tried. And you also don’t need to, because when you use techniques described later in the talk, you can prune the search tree quite a bit :)
                --> baoilleach @marwinsegler @CzodrowskiPaul Don't forget that this is a pruning strategy too, which can be combined with others. If the user specifies a maximum of 4 steps, you can prune after a single retrosyn step if not in the 3TB db.
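
(ed: A sketch of the two ideas in this thread: the b^d size estimate and the lookup-based pruning. The branching factor, the molecule labels and the contents of `forward_db` are made-up placeholders. `forward_db` stands in for the hypothetical database of everything reachable from building blocks within three forward steps, and is assumed to include the building blocks themselves.)

```python
def tree_size(branching_factor, depth):
    """Number of nodes at a given depth of a uniform search tree: b ** d."""
    return branching_factor ** depth


def should_prune(precursors, forward_db):
    """Prune a retrosynthetic step if any precursor is missing from the
    database of molecules reachable within the remaining number of steps."""
    return any(p not in forward_db for p in precursors)


# Toy numbers: 100 applicable reactions per molecule, three steps deep.
print(tree_size(100, 3))

# Toy database of molecules reachable within three forward steps.
forward_db = {"A", "B", "C"}

# With a four-step limit, one retro step leaves three steps for each precursor.
print(should_prune(["A", "B"], forward_db))  # every precursor is reachable
print(should_prune(["A", "X"], forward_db))  # "X" is not reachable in time
```

The lookup is a single set-membership test per precursor, so it can be combined with other pruning strategies at little extra cost.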

iwatobipen MCTS(PUCT)shows the best performance! #RDKitUGM2018 https://t.co/56R8nLphdw
--> iwatobipen And also very fast.
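
(ed: For readers unfamiliar with the terms, a sketch of how a PUCT score differs from a UCT score when choosing which child node to explore. The formulas are the commonly used textbook forms, with PUCT in the AlphaGo style. The exact variant and constants used in the talk are not given in these tweets, so the exploration constant `c` and the toy node statistics are assumptions.)

```python
import math


def uct_score(q, n_parent, n_child, c=1.4):
    """UCT: mean value plus an exploration bonus that ignores any prior."""
    if n_child == 0:
        return float("inf")  # unvisited children are tried first
    return q + c * math.sqrt(math.log(n_parent) / n_child)


def puct_score(q, prior, n_parent, n_child, c=1.4):
    """PUCT: the exploration bonus is scaled by a prior probability,
    for example one predicted by a policy network."""
    return q + c * prior * math.sqrt(n_parent) / (1 + n_child)


# Toy statistics for three children of a node visited 20 times.
n_parent = 20
children = [
    {"name": "a", "q": 0.5, "prior": 0.7, "n": 10},
    {"name": "b", "q": 0.5, "prior": 0.1, "n": 10},
    {"name": "c", "q": 0.0, "prior": 0.2, "n": 0},
]

for ch in children:
    score = puct_score(ch["q"], ch["prior"], n_parent, ch["n"])
    print(ch["name"], round(score, 3))

best = max(
    children,
    key=lambda ch: puct_score(ch["q"], ch["prior"], n_parent, ch["n"]),
)
print("selected:", best["name"])
```

Children "a" and "b" have identical visit counts and values, so UCT cannot tell them apart, while PUCT prefers "a" because of its higher prior.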

baoilleach #RDKitUGM2018 Bias the exploration (vs exploitation) term by extending UCT to PUCT. Compares a variety of methods on Chematica data from 2016. 13s per molecule in the end, 95% of molecules solved.

baoilleach @marwinsegler @DrBostrom #RDKitUGM2018 We propose to use a Markov Decision Process. Use Monte Carlo Tree Search (MCTS), an idea introduced in 2006. Shows @nmsoftware logo to illustrate the multi-armed bandit approach.

baoilleach @marwinsegler @DrBostrom #RDKitUGM2018 The classical way is heuristic best first search. Alternative is proof number search from Heifets. Corey (1968) and Vleduts (1963) both suggested writing down all of chem knowledge in logic form.

CzodrowskiPaul @marwinsegler from @benevolent_ai on stage and his comments on (retro-)synthesis
- Chemists disagree about good solutions
- Synthesis only solved at the end
- Molecular complexity needs to be tactically increased (Protecting groups!)
#RDKitUGM2018

baoilleach @marwinsegler @DrBostrom #RDKitUGM2018 Do we want the global optimum? Probably not - instead a satisfactory and sufficing solution ("satisficing"). We only care about the state we start with. Value fn needs to be computed "online" - as building blocks may become unavailable overnight.

baoilleach @marwinsegler @DrBostrom #RDKitUGM2018 Is retrosyn more like chess endgames or like a maze? Is it a sparse reward search tree? Only one path to the solution. Or dense? Lots of different solutions. Goal state known? Notion of lost states.

baoilleach @marwinsegler @DrBostrom #RDKitUGM2018 Compares retrosyn to games such as chess and go. The big difference is the rules. Trivial and fixed versus complex, unknown and expanding.

Aside: mentions that failed rxns illustrate cases where the chemist thought they were going to work, and so cannot explain.

baoilleach @marwinsegler @DrBostrom #RDKitUGM2018 If you apply all rules, then after 3 steps you are at 100TB. Cognitive science: dual process theory - focus on salient objects first quickly, then slowly a more reasoned process.

baoilleach @marwinsegler @DrBostrom #RDKitUGM2018 Focusing today on efficient search. Shows my favourite molecule c1ccccc1C(=O)Cl

baoilleach @marwinsegler @DrBostrom #RDKitUGM2018 Challenges: 1. get the rules of chemistry into the machine, 2. prioritise retro steps, 3. predict forward rxns, 4. efficient search. Refs Segler Nature 2018, 555, 604. Planning with DNNs.

baoilleach @marwinsegler @DrBostrom #RDKitUGM2018 Shows retrosyn tree from Gini Chem Eur J 2015, 21, 12053. Goal is to decompose until you find building blocks where the mols are purchasable (and in stock).

baoilleach @marwinsegler #RDKitUGM2018 Uses a combination of RDKit and CDK.

We should care about comput aided syn planning because we are staying in a synthesis comfort zone. Refs @DrBostrom's paper.

 RSC_CICAG This is Marwin’s @nature paper: https://t.co/m9XbddX1FA #RDKitUGM2018 @benevolent_ai @marwinsegler #RDKit
-->  RSC_CICAG Marwin (@marwinsegler) also spoke at our #RSC_AIChem conference earlier this year.

 RSC_CICAG Search Algorithms for Computer Aided Synthesis Planning by @marwinsegler from @benevolent_ai. #RDKitUGM2018 https://t.co/lG1VzEubsR

baoilleach #RDKitUGM2018 Marwin Segler (@marwinsegler) on Search algorithms for comput aided synthesis planning
--> baoilleach @marwinsegler #RDKitUGM2018 Uses a combination of RDKit and CDK.

We should care about comput aided syn planning because we are staying in a synthesis comfort zone. Refs @DrBostrom's paper.
    --> baoilleach @marwinsegler @DrBostrom #RDKitUGM2018 Shows retrosyn tree from Gini Chem Eur J 2015, 21, 12053. Goal is to decompose until you find building blocks where the mols are purchasable (and on stock).
        --> baoilleach @marwinsegler @DrBostrom #RDKitUGM2018 Challenges: 1. get the rules of chemistry into the machine, 2. prioritise retro steps, 3. predict forward rxns, 4. efficient search. Refs Segler Nature 2018, 555, 604. Planning with DNNs.
            --> baoilleach @marwinsegler @DrBostrom #RDKitUGM2018 Focusing today on efficient search. Shows my favourite molecule c1ccccc1C(=O)Cl
                --> baoilleach @marwinsegler @DrBostrom #RDKitUGM2018 If you apply all rules, then after 3 steps you are at 100TB. Cognitive science: dual process theory - focus on salient objects first quickly, then slowly a more reasoned process.
                    --> baoilleach @marwinsegler @DrBostrom #RDKitUGM2018 Compares retrosyn to games such as chess and go. The big difference is the rules. Trivial and fixed versus complex, unknown and expanding.

Aside: mentions that failed rxns illustrate cases where the chemist thought they were going to work, and so cannot explain.
                        --> baoilleach @marwinsegler @DrBostrom #RDKitUGM2018 Is retrosyn more like chess endgames or like a maze? Is it a sparse reward search tree? Only one path to the solution. Or dense? Lots of different solutions. Goal state known? Notion of lost states.
                            --> baoilleach @marwinsegler @DrBostrom #RDKitUGM2018 Do we want the global optimum? Probably not - instead a satisfactory and sufficing solution ("satisficing"). We only care about the state we start with. Value fn needs to be computed "online" - as building blocks may become unavailable overnight.
                                --> baoilleach @marwinsegler @DrBostrom #RDKitUGM2018 The classical way is heuristic best first search. Alternative is proof number search from Heifets. Corey (1968) and Vleduts (1963) both suggested writing down all of chem knowledge in logic form.
                                    --> baoilleach @marwinsegler @DrBostrom #RDKitUGM2018 We propose to use a Markov Decision Process. Use Monte Carlo Tree Search (MCTS), an idea introduced in 2006. Shows @nmsoftware logo to illustrate the multi-armed bandit approach.
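
The multi-armed bandit idea mentioned in the thread above can be illustrated with a small simulation. This is only a sketch of the bandit component (the UCB1 selection rule that UCT-style MCTS applies at each tree node), not the method from the talk. The function names, the Bernoulli reward model and the success probabilities are all assumptions made for illustration.

```python
import math
import random


def ucb1_select(counts, values, c=math.sqrt(2)):
    """Return the index of the arm to pull next under the UCB1 rule."""
    # Pull every arm once before applying the confidence-bound formula.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    total = sum(counts)
    # Mean reward (exploitation) plus a bonus that shrinks as an arm is pulled more (exploration).
    scores = [
        values[arm] / counts[arm] + c * math.sqrt(math.log(total) / counts[arm])
        for arm in range(len(counts))
    ]
    return max(range(len(counts)), key=scores.__getitem__)


def run_bandit(success_probs, n_pulls=2000, seed=0):
    """Simulate Bernoulli arms (hypothetical rewards) and return pull counts per arm."""
    rng = random.Random(seed)
    counts = [0] * len(success_probs)
    values = [0.0] * len(success_probs)  # summed rewards per arm
    for _ in range(n_pulls):
        arm = ucb1_select(counts, values)
        reward = 1.0 if rng.random() < success_probs[arm] else 0.0
        counts[arm] += 1
        values[arm] += reward
    return counts
```

Calling `run_bandit([0.2, 0.5, 0.8])` should concentrate most pulls on the last arm while still sampling the others occasionally, which is the exploration/exploitation trade-off a tree search has to make when choosing which retrosynthetic step to expand.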

CzodrowskiPaul @dr_greg_landrum is unbelievable: within 2 minutes, he showed amazing features in Slack: Depiction of SMILES (even 3D, but not shown on the screenshot) #RDKitUGM2018 https://t.co/wDIL3hrVjK

   axelp69 BTW, @CzodrowskiPaul , @dr_greg_landrum , @iwatobipen and all the others, thanks for being so active on Twitter during #RDkitUGM2018. Makes me (almost) feel like I'm there. 😀

iwatobipen Property calculation and visualize mol in slack! #RDKitUGM2018 https://t.co/zJjDog953M

baoilleach #RDKitUGM2018 I didn't get a chance to present my lightning talk on DeepSMILES, so here it is:

https://t.co/SZAVKdIRsA
--> dr_greg_landrum @baoilleach Shit... did I forget you or did you not tell me about this?
--> nathanbroon @baoilleach That’s super cool, Noel. Will take a read of your preprint later. #RDKitUGM2018
    --> baoilleach @nathanbroon Yeah, later. No but that's okay. I'm cool with that.
        --> nathanbroon @baoilleach Just read it and pleased you can roundtrip without error. Just wondering if you’ve thought about the canonicalisation of the parent molecule leading to non-canonical substructures? Could you represent the SMILES with layers of canonicalised fragment SMILES in some way?
            --> baoilleach @nathanbroon They call it a reduced graph. You should visit Sheffield some time.
                --> nathanbroon @baoilleach I call them feature trees…
                --> nathanbroon @baoilleach I mean having the substructure SMILES still explicitly defined in the SMILES.
                --> baoilleach @nathanbroon But srsly, a reduced graph does essentially that. I cite a paper where they've done that and it's worth pursuing. But in general, it's an interesting idea to do this for SMILES, e.g. change the traversal order so that certain substructures are always the same
                    --> nathanbroon @baoilleach But the traversal order to canonicalise one substructure will necessarily change the canonical form of others as you navigate the structure, no?
                    --> nathanbroon @baoilleach I have a painful representation in my head but it’s ugly. Need to think about this more.
                        --> baoilleach @nathanbroon Well, you can do it for terminal groups at least.
                            --> nathanbroon @baoilleach For sure
    --> baoilleach @dr_greg_landrum I guess the first? But there were already a lot of lightning talks from people who weren't giving talks.
    --> CzodrowskiPaul @dr_greg_landrum @baoilleach I apologize for asking for the opportunity to ask questions!
        --> baoilleach @CzodrowskiPaul @dr_greg_landrum I will not bear a grudge. I will not bear a grudge. I will not bear a grudge. (I'll keep repeating until it works)
            --> dr_greg_landrum @baoilleach @CzodrowskiPaul what do you have against grudges?
                --> baoilleach @dr_greg_landrum @CzodrowskiPaul I've just got so many right now.

iwatobipen Next presentation is "Computer aided synthesis planning" ! #RDKitUGM2018

CzodrowskiPaul #RDKitUGM2018 here is the RDedit code by @ChemITnerf : https://t.co/hutLF61Oha
-->    axelp69 @CzodrowskiPaul @ChemITnerf Damn, I am beginning to really regret that I did not go. Would love to have met Esben.

CzodrowskiPaul Unbelievable work by @ChemITnerf : RDedit entirely RDKit empowered molecule editor #RDKitUGM2018 https://t.co/jKgEIETtCn
--> markussitzmann @CzodrowskiPaul @ChemITnerf Any word about availability? 😀
    --> baoilleach @markussitzmann @CzodrowskiPaul @ChemITnerf Up on his github page.

CzodrowskiPaul Japan's RDKit poweruser Iwato @iwatobipen  https://t.co/i1G3fe5rki  shows his clustering approach for large data sets #RDKitUGM2018
-->  abhik1368 @CzodrowskiPaul @iwatobipen Are you presenting umap and tsne ? if yes, in tsne how are you handling perplexity ?
    --> iwatobipen @abhik1368 @CzodrowskiPaul I talked about fast cluster. I will share my slides soon.

baoilleach @iwatobipen #RDKitUGM2018 Bayon is a library for clustering large datasets. 17 mins for fp calc for ChEMBL24. 18 min for clustering. (On macbook pro)

baoilleach #RDKitUGM2018 @iwatobipen (missed full name - can you Tweet?) presenting on clustering large sets of molecules. Many RDKit users in Japan. Member of *J-CLIC - Japan cmpd Library Consortium. >150K cmpds purchased over 5 years.
--> baoilleach @iwatobipen #RDKitUGM2018 Bayon is a library for clustering large datasets. 17 mins for fp calc for ChEMBL24. 18 min for clustering. (On macbook pro)
--> iwatobipen @baoilleach Thanks. My name is Takayuki Serizawa.

CzodrowskiPaul Andrew Dalke shows the https://t.co/q5JZTGzXjZ : only a bit more than a year for Python2 support. @dr_greg_landrum already does not add new features to the Python2 @RDKit_org version  #RDKitUGM2018 https://t.co/HJOzhOzYT7
-->    axelp69 @CzodrowskiPaul @dr_greg_landrum @RDKit_org Yup, it's time to switch.
    --> CzodrowskiPaul @axelp69 @dr_greg_landrum @RDKit_org There were only very few hands (including mine) raised when asking for Python2 users
        -->    axelp69 @CzodrowskiPaul @dr_greg_landrum @RDKit_org Yeah, I am exclusively using Python 3 for two years now.

CzodrowskiPaul One powerful feature in scikit-learn (at least new to me): there is a functionality for time-split crossvalidation in @scikit_learn:
https://t.co/OjOEGiKxDA
Kudos to Lewis Mervin @LPrezidente during his lovely presentation about target prediction at the #RDKitUGM2018
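
The link above is shortened, so its exact target is not visible here. scikit-learn does ship a `TimeSeriesSplit` class in `sklearn.model_selection` for this purpose. The sketch below is a hand-rolled, standard-library illustration of the same expanding-window idea, not scikit-learn's own code. It assumes the samples are already sorted by time, and the function name is made up for the example.

```python
def time_split_indices(n_samples, n_splits):
    """Yield (train, test) index lists in which training data always precedes test data."""
    if n_splits < 1 or n_samples < n_splits + 1:
        raise ValueError("need at least n_splits + 1 samples")
    fold = n_samples // (n_splits + 1)  # size of each test block
    for k in range(1, n_splits + 1):
        # The training window grows with each split; the test block is the next slice in time.
        train_end = n_samples - (n_splits - k + 1) * fold
        yield list(range(train_end)), list(range(train_end, train_end + fold))


for train, test in time_split_indices(6, 3):
    print(train, test)
```

Unlike a random split, no test compound is ever "older" than a training compound, which is closer to how a target-prediction model is used prospectively.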

baoilleach #RDKitUGM2018 Ref to Aniceto. J Cheminf. 8.1 2016 69. Includes a correction for the bias with measuring distance to nearest nbrs.

baoilleach #RDKitUGM2018 Measurement of applicability domain (AD) via distance to training set structures does not usually correlate well with error rate. AD is context-specific. Local characteristics must be considered. May be poor data or poor density.
-->   jcheminf @baoilleach indeed, online journals do not number article pages by volume. More info can be found under "Citing articles in Journal of Cheminformatics" at https://t.co/CT1htM2fNR

CzodrowskiPaul #RDKitUGM2018 One comment after the wonderful presentation by Ben Tehan & Rob Smith (both from @soseiheptaresco) by Oliver Smart: "This is bad Python style", i.e.
except: None https://t.co/fRTrurzI15
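
For readers wondering why `except: None` drew that comment, here is a minimal sketch contrasting the pattern with a narrower handler. The function names and the float-parsing task are invented for illustration and are not taken from the presentation.

```python
def parse_float_bad(text):
    """The criticised pattern: a bare except that hides every error."""
    try:
        return float(text)
    except:  # bare except also swallows KeyboardInterrupt and SystemExit
        None  # a no-op expression; the function silently returns None


def parse_float(text, default=None):
    """Catch only the errors float() is expected to raise for bad input."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return default
```

Both return `None` for `"abc"`, but the second version makes the fallback explicit and lets unrelated failures surface instead of disappearing.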

iwatobipen PIDGIN Version 2: Prediction IncluDinG INactivity Version 2
https://t.co/VUbw01fRP8 #RDKitUGM2018
--> iwatobipen https://t.co/qIZIQpxlu5

baoilleach #RDKitUGM2018 PIDGIN - "prediction including inactivity". Mervin 2018 J Cheminf 7 51 (ed: citation page numbers incorrect on slide)
--> baoilleach FYI @jcheminf, this is an example in the wild of people citing the page numbers in the PDF (1-16) rather than the article number. Probably quite common - I've almost done it myself.
--> baoilleach #RDKitUGM2018 Measurement of applicability domain (AD) via distance to training set structures does not usually correlate well with error rate. AD is context-specific. Local characteristics must be considered. May be poor data or poor density.
    -->   jcheminf @baoilleach indeed, online journals do not number article pages by volume. More info can be found under "Citing articles in Journal of Cheminformatics" at https://t.co/CT1htM2fNR

 RSC_CICAG The source code from Lewis is here: https://t.co/1zl6tYaOUb #RDKitUGM2018

 RSC_CICAG Last talk of the morning from Natalia Aniceto and Lewis Mervin on protein-target predictions. #RDKitUGM2018 https://t.co/LI5j2H2gm5
-->  RSC_CICAG The source code from Lewis is here: https://t.co/1zl6tYaOUb #RDKitUGM2018

baoilleach #RDKitUGM2018 Natalia Aniceto and Lewis Mervin on In silico protein target pred with reliability-density nbrhood analysis
-->      rguha @baoilleach So they mention how fast it is?
--> baoilleach #RDKitUGM2018 PIDGIN - "prediction including inactivity". Mervin 2018 J Cheminf 7 51 (ed: citation page numbers incorrect on slide)
    --> baoilleach FYI @jcheminf, this is an example in the wild of people citing the page numbers in the PDF (1-16) rather than the article number. Probably quite common - I've almost done it myself.
    --> baoilleach #RDKitUGM2018 Measurement of applicability domain (AD) via distance to training set structures does not usually correlate well with error rate. AD is context-specific. Local characteristics must be considered. May be poor data or poor density.
        -->   jcheminf @baoilleach indeed, online journals do not number article pages by volume. More info can be found under "Citing articles in Journal of Cheminformatics" at https://t.co/CT1htM2fNR
    --> baoilleach @rguha I'll listen out
        --> baoilleach @rguha No and I'm not sure I picked up enough to be confident of asking a question about it.

baoilleach @iwatobipen #RDKitUGM2018 Matched series analysis. Free-Wilson analysis. R Group decomp, R group relative ranking. What's the penalty going to be associated with a change in R group.

baoilleach @iwatobipen #RDKitUGM2018 Greg has done bioisosteric replacement blogpost Feb 2018.  Rob talking now about haloperidol as example for his implementation of MMPA.
--> baoilleach @iwatobipen #RDKitUGM2018 Matched series analysis. Free-Wilson analysis. R Group decomp, R group relative ranking. What's the penalty going to be associated with a change in R group.

baoilleach @iwatobipen #RDKitUGM2018 Shows humongous SQL statement to pull out all transforms from the mmpdb database. Very easy to pull out the transform, the context, cluster them, and show everything.
--> baoilleach @iwatobipen #RDKitUGM2018 Greg has done bioisosteric replacement blogpost Feb 2018.  Rob talking now about haloperidol as example for his implementation of MMPA.
    --> baoilleach @iwatobipen #RDKitUGM2018 Matched series analysis. Free-Wilson analysis. R Group decomp, R group relative ranking. What's the penalty going to be associated with a change in R group.

baoilleach @iwatobipen #RDKitUGM2018 Wagener the quest for bioisosteric replacements JCIM. Hussain+Rea MMPs shoutout. mmpdb shoutout from Andrew Dalke.
--> baoilleach @iwatobipen #RDKitUGM2018 Shows humongous SQL statement to pull out all transforms from the mmpdb database. Very easy to pull out the transform, the context, cluster them, and show everything.
    --> baoilleach @iwatobipen #RDKitUGM2018 Greg has done bioisosteric replacement blogpost Feb 2018.  Rob talking now about haloperidol as example for his implementation of MMPA.
        --> baoilleach @iwatobipen #RDKitUGM2018 Matched series analysis. Free-Wilson analysis. R Group decomp, R group relative ranking. What's the penalty going to be associated with a change in R group.

baoilleach @iwatobipen #RDKitUGM2018 Use Django webserver to do the PAINs filters and so forth. Exposed to users through Vortex.

Patent analysis - "fixing" sparse data. Do we see trends for stability in the amines?
--> baoilleach @iwatobipen #RDKitUGM2018 Wagener the quest for bioisosteric replacements JCIM. Hussain+Rea MMPs shoutout. mmpdb shoutout from Andrew Dalke.
    --> baoilleach @iwatobipen #RDKitUGM2018 Shows humongous SQL statement to pull out all transforms from the mmpdb database. Very easy to pull out the transform, the context, cluster them, and show everything.
        --> baoilleach @iwatobipen #RDKitUGM2018 Greg has done bioisosteric replacement blogpost Feb 2018.  Rob talking now about haloperidol as example for his implementation of MMPA.
            --> baoilleach @iwatobipen #RDKitUGM2018 Matched series analysis. Free-Wilson analysis. R Group decomp, R group relative ranking. What's the penalty going to be associated with a change in R group.

baoilleach @iwatobipen #RDKitUGM2018 Compared different alignment methods from GPCR bench dataset. ROCs vs Align-it, AlignMol, GetO3A, GetCrippenO3A.
--> baoilleach @iwatobipen #RDKitUGM2018 Use Django webserver to do the PAINs filters and so forth. Exposed to users through Vortex.

Patent analysis - "fixing" sparse data. Do we see trends for stability in the amines?
    --> baoilleach @iwatobipen #RDKitUGM2018 Wagener the quest for bioisosteric replacements JCIM. Hussain+Rea MMPs shoutout. mmpdb shoutout from Andrew Dalke.
        --> baoilleach @iwatobipen #RDKitUGM2018 Shows humongous SQL statement to pull out all transforms from the mmpdb database. Very easy to pull out the transform, the context, cluster them, and show everything.
            --> baoilleach @iwatobipen #RDKitUGM2018 Greg has done bioisosteric replacement blogpost Feb 2018.  Rob talking now about haloperidol as example for his implementation of MMPA.
                --> baoilleach @iwatobipen #RDKitUGM2018 Matched series analysis. Free-Wilson analysis. R Group decomp, R group relative ranking. What's the penalty going to be associated with a change in R group.

baoilleach @iwatobipen #RDKitUGM2018 Crippen-based atom type alignments seem to work quite well, compared to O3A.

Tech note: use Python "futures" to distribute jobs over multiple processors.
--> baoilleach @iwatobipen #RDKitUGM2018 Compared different alignment methods from GPCR bench dataset. ROCs vs Align-it, AlignMol, GetO3A, GetCrippenO3A.
    --> baoilleach @iwatobipen #RDKitUGM2018 Use Django webserver to do the PAINs filters and so forth. Exposed to users through Vortex.

Patent analysis - "fixing" sparse data. Do we see trends for stability in the amines?
        --> baoilleach @iwatobipen #RDKitUGM2018 Wagener the quest for bioisosteric replacements JCIM. Hussain+Rea MMPs shoutout. mmpdb shoutout from Andrew Dalke.
            --> baoilleach @iwatobipen #RDKitUGM2018 Shows humongous SQL statement to pull out all transforms from the mmpdb database. Very easy to pull out the transform, the context, cluster them, and show everything.
                --> baoilleach @iwatobipen #RDKitUGM2018 Greg has done bioisosteric replacement blogpost Feb 2018.  Rob talking now about haloperidol as example for his implementation of MMPA.
                    --> baoilleach @iwatobipen #RDKitUGM2018 Matched series analysis. Free-Wilson analysis. R Group decomp, R group relative ranking. What's the penalty going to be associated with a change in R group.
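
The tech note in the thread above (Python "futures" to distribute jobs over multiple processors) presumably refers to the standard-library `concurrent.futures` module. The following is a minimal sketch under that assumption. `sum_of_squares` is a toy stand-in for an expensive per-molecule job such as an alignment, and `run_jobs` is a hypothetical helper, not code from the talk.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def sum_of_squares(n):
    """Toy CPU-bound job standing in for a real per-molecule calculation."""
    return sum(i * i for i in range(n))


def run_jobs(func, items, executor_cls=ProcessPoolExecutor, max_workers=None):
    """Apply func to every item using a pool of workers; results keep the input order."""
    with executor_cls(max_workers=max_workers) as pool:
        return list(pool.map(func, items))
```

With the default `ProcessPoolExecutor`, the job function has to be defined at module level so it can be pickled, and the call to `run_jobs` should sit under an `if __name__ == "__main__":` guard on platforms that spawn worker processes. Passing `executor_cls=ThreadPoolExecutor` uses threads instead, which is convenient for quick checks.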

iwatobipen RDKit's shape-based alignments are nice. #RDKitUGM2018

baoilleach #RDKitUGM2018 Shoutout to @iwatobipen 's blog on similarity/shape alignment.

Assessing alignments: ShapeTanimotoDist and ShapeProtrudeDist.
--> baoilleach @iwatobipen #RDKitUGM2018 Crippen-based atom type alignments seem to work quite well, compared to O3A.

Tech note: use Python "futures" to distribute jobs over multiple processors.
    --> baoilleach @iwatobipen #RDKitUGM2018 Compared different alignment methods from GPCR bench dataset. ROCs vs Align-it, AlignMol, GetO3A, GetCrippenO3A.
        --> baoilleach @iwatobipen #RDKitUGM2018 Use Django webserver to do the PAINs filters and so forth. Exposed to users through Vortex.

Patent analysis - "fixing" sparse data. Do we see trends for stability in the amines?
            --> baoilleach @iwatobipen #RDKitUGM2018 Wagener the quest for bioisosteric replacements JCIM. Hussain+Rea MMPs shoutout. mmpdb shoutout from Andrew Dalke.
                --> baoilleach @iwatobipen #RDKitUGM2018 Shows humongous SQL statement to pull out all transforms from the mmpdb database. Very easy to pull out the transform, the context, cluster them, and show everything.
                    --> baoilleach @iwatobipen #RDKitUGM2018 Greg has done bioisosteric replacement blogpost Feb 2018.  Rob talking now about haloperidol as example for his implementation of MMPA.
                        --> baoilleach @iwatobipen #RDKitUGM2018 Matched series analysis. Free-Wilson analysis. R Group decomp, R group relative ranking. What's the penalty going to be associated with a change in R group.

baoilleach #RDKitUGM2018 Nice to use a knowledge-driven approach to drive forward, rather than a patent-busting one, e.g. MACCS similarity, then ECFP4 similarity. Shoutout to my paper  https://t.co/iaFbaPIr7f
--> baoilleach #RDKitUGM2018 Shoutout to @iwatobipen 's blog on similarity/shape alignment.

Assessing alignments: ShapeTanimotoDist and ShapeProtrudeDist.
    --> baoilleach @iwatobipen #RDKitUGM2018 Crippen-based atom type alignments seem to work quite well, compared to O3A.

Tech note: use Python "futures" to distribute jobs over multiple processors.
        --> baoilleach @iwatobipen #RDKitUGM2018 Compared different alignment methods from GPCR bench dataset. ROCs vs Align-it, AlignMol, GetO3A, GetCrippenO3A.
            --> baoilleach @iwatobipen #RDKitUGM2018 Use Django webserver to do the PAINs filters and so forth. Exposed to users through Vortex.

Patent analysis - "fixing" sparse data. Do we see trends for stability in the amines?
                --> baoilleach @iwatobipen #RDKitUGM2018 Wagener the quest for bioisosteric replacements JCIM. Hussain+Rea MMPs shoutout. mmpdb shoutout from Andrew Dalke.
                    --> baoilleach @iwatobipen #RDKitUGM2018 Shows humongous SQL statement to pull out all transforms from the mmpdb database. Very easy to pull out the transform, the context, cluster them, and show everything.
                        --> baoilleach @iwatobipen #RDKitUGM2018 Greg has done bioisosteric replacement blogpost Feb 2018.  Rob talking now about haloperidol as example for his implementation of MMPA.
                            --> baoilleach @iwatobipen #RDKitUGM2018 Matched series analysis. Free-Wilson analysis. R Group decomp, R group relative ranking. What's the penalty going to be associated with a change in R group.

baoilleach #RDKitUGM2018 FOG analyses run every night- Frequency of R groups in Surechembl (Tyrchan JCIM). Uses Fraggle to find parts similar in one area but may be quite different in another.
--> baoilleach #RDKitUGM2018 Nice to use a knowledge-driven approach to drive forward, rather than a patent-busting one, e.g. MACCS similarity, then ECFP4 similarity. Shoutout to my paper  https://t.co/iaFbaPIr7f
    --> baoilleach #RDKitUGM2018 Shoutout to @iwatobipen 's blog on similarity/shape alignment.

Assessing alignments: ShapeTanimotoDist and ShapeProtrudeDist.
        --> baoilleach @iwatobipen #RDKitUGM2018 Crippen-based atom type alignments seem to work quite well, compared to O3A.

Tech note: use Python "futures" to distribute jobs over multiple processors.
            --> baoilleach @iwatobipen #RDKitUGM2018 Compared different alignment methods from GPCR bench dataset. ROCs vs Align-it, AlignMol, GetO3A, GetCrippenO3A.
                --> baoilleach @iwatobipen #RDKitUGM2018 Use Django webserver to do the PAINs filters and so forth. Exposed to users through Vortex.

Patent analysis - "fixing" sparse data. Do we see trends for stability in the amines?
                    --> baoilleach @iwatobipen #RDKitUGM2018 Wagener the quest for bioisosteric replacements JCIM. Hussain+Rea MMPs shoutout. mmpdb shoutout from Andrew Dalke.
                        --> baoilleach @iwatobipen #RDKitUGM2018 Shows humongous SQL statement to pull out all transforms from the mmpdb database. Very easy to pull out the transform, the context, cluster them, and show everything.
                            --> baoilleach @iwatobipen #RDKitUGM2018 Greg has done bioisosteric replacement blogpost Feb 2018.  Rob talking now about haloperidol as example for his implementation of MMPA.
                                --> baoilleach @iwatobipen #RDKitUGM2018 Matched series analysis. Free-Wilson analysis. R Group decomp, R group relative ranking. What's the penalty going to be associated with a change in R group.

 RSC_CICAG Now a talk from Ben Tehan & Rob Smith (Heptares) on how the #RDKit is used in biotech. #RDKitUGM2018 https://t.co/8n3gGcnJ9D

baoilleach #RDKitUGM2018 Ben Tehan and Rob Smith (Heptares) on RDKit in the Modern Biotech
--> baoilleach Initially multitude of languages, webservers and toolkits. 2013 moved to a cleaner system, python+django+RDKit.
    --> baoilleach #RDKitUGM2018 FOG analyses run every night- Frequency of R groups in Surechembl (Tyrchan JCIM). Uses Fraggle to find parts similar in one area but may be quite different in another.
        --> baoilleach #RDKitUGM2018 Nice to use a knowledge-driven approach to drive forward, rather than a patent-busting one, e.g. MACCS similarity, then ECFP4 similarity. Shoutout to my paper  https://t.co/iaFbaPIr7f
            --> baoilleach #RDKitUGM2018 Shoutout to @iwatobipen 's blog on similarity/shape alignment.

Assessing alignments: ShapeTanimotoDist and ShapeProtrudeDist.
                --> baoilleach @iwatobipen #RDKitUGM2018 Crippen-based atom type alignments seem to work quite well, compared to O3A.

Tech note: use Python "futures" to distribute jobs over multiple processors.
                    --> baoilleach @iwatobipen #RDKitUGM2018 Compared different alignment methods from GPCR bench dataset. ROCs vs Align-it, AlignMol, GetO3A, GetCrippenO3A.
                        --> baoilleach @iwatobipen #RDKitUGM2018 Use Django webserver to do the PAINs filters and so forth. Exposed to users through Vortex.

Patent analysis - "fixing" sparse data. Do we see trends for stability in the amines?
                            --> baoilleach @iwatobipen #RDKitUGM2018 Wagener the quest for bioisosteric replacements JCIM. Hussain+Rea MMPs shoutout. mmpdb shoutout from Andrew Dalke.
                                --> baoilleach @iwatobipen #RDKitUGM2018 Shows humongous SQL statement to pull out all transforms from the mmpdb database. Very easy to pull out the transform, the context, cluster them, and show everything.
                                    --> baoilleach @iwatobipen #RDKitUGM2018 Greg has done bioisosteric replacement blogpost Feb 2018.  Rob talking now about haloperidol as example for his implementation of MMPA.
                                        --> baoilleach @iwatobipen #RDKitUGM2018 Matched series analysis. Free-Wilson analysis. R Group decomp, R group relative ranking. What's the penalty going to be associated with a change in R group.

 RSC_CICAG deepScaffOpt. #RDKitUGM2018 https://t.co/cFUjboMo3H

CzodrowskiPaul @tevangelidis gives an update about his scaffold optimization approach taking into account 2D & 3D information at the #RDKitUGM2018 https://t.co/9FKbogFUDS

 RSC_CICAG Next up is Thomas Evangelidis on scaffold optimisation. #RDKitUGM2018 https://t.co/Er8T5vlFkH
-->  RSC_CICAG deepScaffOpt. #RDKitUGM2018 https://t.co/cFUjboMo3H

     jwmay Roger to Greg after Noel’s SMILES talk. #RDKitUGM2018 https://t.co/YN1aMhOtB7
-->    pwk2013 @jwmay Gives a whole new meaning to being rogered

nathanbroon All of the #Chemoinformatics team from @benevolent_ai are at the #RDKitUGM2018. Come and chat to us about open positions. We’re hiring! #CompChem #RealTimeChem

 RSC_CICAG Master of RDKit College, Cambridge. #RDKitUGM2018 https://t.co/vGSJJA1WB0

iwatobipen Datasets and scripts are available at:
https://t.co/PltbUX54BO #RDKitUGM2018

 RSC_CICAG Toolkits with SMILES readers in them. cc @macinchem #RDKitUGM2018 https://t.co/QF7sEYSsk9

CzodrowskiPaul @baoilleach in his talk at the #RDKitUGM2018 "Happy valence models are all alike; every unhappy valence model is unhappy in its own way" (with apologies to Tolstoy)

CzodrowskiPaul @baoilleach reports on benchmarks to standardize SMILES in order to compare the performance between different toolkits at the #RDKitUGM2018

 RSC_CICAG Noel clearly likes his SMILES. Here is his preprint for his DeepSMILES method to use in #MachineLearning: https://t.co/9O76WFLzjr #RDKitUGM2018 cc @baoilleach
-->  RSC_CICAG Toolkits with SMILES readers in them. cc @macinchem #RDKitUGM2018 https://t.co/QF7sEYSsk9

CzodrowskiPaul Beautiful parallel coordinates view within @knime for interactive data crunching presented by @daria_goldmann at #RDKitUGM2018 https://t.co/kboRx2nP30

 RSC_CICAG Next up is Noel O’Boyle (@baoilleach from @nmsoftware) talking on benchmarks for reading SMILES. https://t.co/HPO7hzGflq #RDKitUGM2018
-->  RSC_CICAG Noel clearly likes his SMILES. Here is his preprint for his DeepSMILES method to use in #MachineLearning: https://t.co/9O76WFLzjr #RDKitUGM2018 cc @baoilleach
    -->  RSC_CICAG Toolkits with SMILES readers in them. cc @macinchem #RDKitUGM2018 https://t.co/QF7sEYSsk9

 RSC_CICAG Terrific interactive parallel coordinates plots in KNIME. #RDKitUGM2018 https://t.co/L3qRKr3xWz

iwatobipen Interactive patent analysis with KNIME! #RDKitUGM2018 https://t.co/1GVGxdDcWw

 RSC_CICAG Great KNIME workflow for guided analytics with interactive views. #RDKitUGM2018 https://t.co/VK4SPcVShr
-->  RSC_CICAG Terrific interactive parallel coordinates plots in KNIME. #RDKitUGM2018 https://t.co/L3qRKr3xWz

iwatobipen Pat informatics with KNIME! #RDKitUGM2018 https://t.co/rxNcn6ryxa

 RSC_CICAG Great idea from @wpwalters to get #RDKitUGM2018 heads together to build and curate a bioisosteric dataset and release it for the community. This will help benchmark bioisosteric replacement methods.

 RSC_CICAG Next up is Daria Goldmann from @knime talking on chemistry in KNIME. #RDKitUGM2018 https://t.co/dj3beG6guW
-->  RSC_CICAG Great KNIME workflow for guided analytics with interactive views. #RDKitUGM2018 https://t.co/VK4SPcVShr
    -->  RSC_CICAG Terrific interactive parallel coordinates plots in KNIME. #RDKitUGM2018 https://t.co/L3qRKr3xWz

 RSC_CICAG Comparison of this new implementation with the implementation in SwissBioisostere. #RDKitUGM2018 https://t.co/7osgdu8wbI
-->  RSC_CICAG Great idea from @wpwalters to get #RDKitUGM2018 heads together to build and curate a bioisosteric dataset and release it for the community. This will help benchmark bioisosteric replacement methods.

CzodrowskiPaul Joshua Meyers from @benevolent_ai about R-Group decomposition: Nest Integration into Pandas! #RDKitUGM2018 https://t.co/nyCZJwwhsF

 RSC_CICAG Workflow for R-group decomposition, descriptor generation and bioisosteric replacements #RDKitUGM2018 https://t.co/pTkzWkl0AG
-->  RSC_CICAG Comparison of this new implementation with the implementation in SwissBioisostere. #RDKitUGM2018 https://t.co/7osgdu8wbI
    -->  RSC_CICAG Great idea from @wpwalters to get #RDKitUGM2018 heads together to build and curate a bioisosteric dataset and release it for the community. This will help benchmark bioisosteric replacement methods.

 RSC_CICAG Bioisosteric replacements as metaphored with Game of Thrones. #RDKitUGM2018 https://t.co/VxNQgNYg1x
-->  RSC_CICAG Workflow for R-group decomposition, descriptor generation and bioisosteric replacements #RDKitUGM2018 https://t.co/pTkzWkl0AG
    -->  RSC_CICAG Comparison of this new implementation with the implementation in SwissBioisostere. #RDKitUGM2018 https://t.co/7osgdu8wbI
        -->  RSC_CICAG Great idea from @wpwalters to get #RDKitUGM2018 heads together to build and curate a bioisosteric dataset and release it for the community. This will help benchmark bioisosteric replacement methods.

 RSC_CICAG #RDKitUGM2018 https://t.co/lx8thYDPPC
-->  RSC_CICAG Bioisosteric replacements as metaphored with Game of Thrones. #RDKitUGM2018 https://t.co/VxNQgNYg1x
    -->  RSC_CICAG Workflow for R-group decomposition, descriptor generation and bioisosteric replacements #RDKitUGM2018 https://t.co/pTkzWkl0AG
        -->  RSC_CICAG Comparison of this new implementation with the implementation in SwissBioisostere. #RDKitUGM2018 https://t.co/7osgdu8wbI
            -->  RSC_CICAG Great idea from @wpwalters to get #RDKitUGM2018 heads together to build and curate a bioisosteric dataset and release it for the community. This will help benchmark bioisosteric replacement methods.

iwatobipen https://t.co/4gzxuhBwPU #RDKitUGM2018

 RSC_CICAG Good slide on essence of molecular design! #RDKitUGM2018 https://t.co/VyjqldlWKb
-->  RSC_CICAG #RDKitUGM2018 https://t.co/lx8thYDPPC
    -->  RSC_CICAG Bioisosteric replacements as metaphored with Game of Thrones. #RDKitUGM2018 https://t.co/VxNQgNYg1x
        -->  RSC_CICAG Workflow for R-group decomposition, descriptor generation and bioisosteric replacements #RDKitUGM2018 https://t.co/pTkzWkl0AG
            -->  RSC_CICAG Comparison of this new implementation with the implementation in SwissBioisostere. #RDKitUGM2018 https://t.co/7osgdu8wbI
                -->  RSC_CICAG Great idea from @wpwalters to get #RDKitUGM2018 heads together to build and curate a bioisosteric dataset and release it for the community. This will help benchmark bioisosteric replacement methods.

 RSC_CICAG Starting off today are @mattymattmo & @DrJoshuaBox from @benevolent_ai with a presentation on r-group descriptors in the RDKit. #RDKitUGM2018 https://t.co/LmYjiCaDXa
-->  RSC_CICAG Good slide on essence of molecular design! #RDKitUGM2018 https://t.co/VyjqldlWKb
    -->  RSC_CICAG #RDKitUGM2018 https://t.co/lx8thYDPPC
        -->  RSC_CICAG Bioisosteric replacements as metaphored with Game of Thrones. #RDKitUGM2018 https://t.co/VxNQgNYg1x
            -->  RSC_CICAG Workflow for R-group decomposition, descriptor generation and bioisosteric replacements #RDKitUGM2018 https://t.co/pTkzWkl0AG
                -->  RSC_CICAG Comparison of this new implementation with the implementation in SwissBioisostere. #RDKitUGM2018 https://t.co/7osgdu8wbI
                    -->  RSC_CICAG Great idea from @wpwalters to get #RDKitUGM2018 heads together to build and curate a bioisosteric dataset and release it for the community. This will help benchmark bioisosteric replacement methods.

iwatobipen Rediscovering R-Group Descriptors with RDKit #RDKitUGM2018

 RSC_CICAG Day two begins and a reminder of last night’s formal dinner at Magdalene. #RDKitUGM2018 https://t.co/JZsBee4v85

CzodrowskiPaul 2nd exciting day of the #RDKitUGM2018 https://t.co/cbU2NHylN0

   SiFulle Looking forward to the 2nd Day of the #RDKitUGM2018 Staying in a college over a night is always a classic thing to do in Oxbridge https://t.co/GusQHpLm0j
-->   dagmareh @SiFulle Hi from Bath to Oxbridge!

CzodrowskiPaul #RDKitUGM2018 Good Morning from the RDKitten runners! https://t.co/pmS28A7vs0
-->    axelp69 @CzodrowskiPaul Wow, the dedication....
--> dr_greg_landrum @CzodrowskiPaul Wow, you guys actually went. Nice! I'm sad that I bailed due to the rain. Particularly since the rain then stopped.

mattymattmo Looking forward to speaking tomorrow at the #RDKitUGM2018. It’s a 9am start. Me and @DrJoshuaBox on first 😬 https://t.co/4NG6tqQaQX

CzodrowskiPaul #RDKitUGM2018 No electricity (but eclecticism!) allowed https://t.co/IlGgdbYb50

    hjuinj formal dinner at #RDKitUGM2018 with @baoilleach (as seen in picture) and some other 10 dozen people. https://t.co/5MT6K0uRPn
--> baoilleach @hjuinj With @gp_lau @mattymattmo and more https://t.co/IVm6yqZUbc
    --> baoilleach @hjuinj @gp_lau @mattymattmo https://t.co/Z6ZISxIQjf

CzodrowskiPaul #RDKitUGM2018 Dinner time! https://t.co/0vCFRfyLHC
--> AkiraAiren @CzodrowskiPaul I like the combination of colors in particular... 😂

baoilleach @ChemRxiv My first @ChemRxiv preprint. And perhaps of relevance to #RDKitUGM2018
--> JarvistFrost @baoilleach @ChemRxiv Noel, have you thought about how to make SMILES canonical? (i.e. reorder them into a standardised form.)
Or is that always easier to do by taking them forwards to a molecular graph & then checking for isomorphism?
    --> baoilleach @JarvistFrost @ChemRxiv If I understand correctly, you are not asking about the paper, just about how to generate a canonical SMILES. Like making elephant soup, step one is the hard one: canonicalize the underlying graph. Step two, generating the canonical SMILES, is trivial.
        --> baoilleach @JarvistFrost @ChemRxiv I'm not an expert on graph canonicalisation. My colleague @jwmay knows all the tricks though, and I am waiting for him to write up a paper on the topic (hint hint).
        --> baoilleach @JarvistFrost @ChemRxiv Of course, these algorithms are already in all of the standard toolkits. And you can generally ask for the canonical labels and symmetry classes. e.g. https://t.co/VrCKLNptqA

But I'd actually recommend using randomly ordered SMILES for neural networks:
https://t.co/aoYCmsRZPg
            --> JarvistFrost @baoilleach @ChemRxiv Neural networks scare me, but I'm always interested in representations!
I wonder if a different canonisation procedure could avoid the 'hot' atom at the start.
            --> JarvistFrost @baoilleach @ChemRxiv @jwmay Would definitely like to read it when you write it, @jwmay !
(And hint hint, a blog-post is a good gateway into both disseminating knowledge and writing the full paper :^)
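
(ed: a rough illustration of the "symmetry classes" mentioned in the thread above. This is a minimal pure-Python sketch of iterative invariant refinement on a molecular graph, not the algorithm used by RDKit or any other toolkit. The function names and the atom/bond list input format are made up for the example. A real canonicaliser also needs a tie-breaking step to turn classes into a unique atom ordering, which is omitted here.)

```python
def _rank(keys):
    # Map each distinct key to a dense integer rank (0, 1, 2, ...).
    order = {key: rank for rank, key in enumerate(sorted(set(keys)))}
    return [order[key] for key in keys]


def symmetry_classes(atoms, bonds):
    """Return one class label per atom.

    atoms: list of element symbols, e.g. ["C", "C", "O", "C"]
    bonds: list of (i, j) index pairs (bond orders are ignored in this sketch)
    Atoms sharing a label cannot be told apart by this simple refinement.
    """
    neighbours = [[] for _ in atoms]
    for a, b in bonds:
        neighbours[a].append(b)
        neighbours[b].append(a)

    # Initial invariant: element symbol plus number of neighbours.
    ranks = _rank([(atoms[i], len(neighbours[i])) for i in range(len(atoms))])

    while True:
        # Refine: extend each atom's rank with the sorted ranks of its neighbours.
        keys = [
            (ranks[i], tuple(sorted(ranks[j] for j in neighbours[i])))
            for i in range(len(atoms))
        ]
        new_ranks = _rank(keys)
        # Refinement can only split classes, so an unchanged count means stable.
        if len(set(new_ranks)) == len(set(ranks)):
            return new_ranks
        ranks = new_ranks


# Propan-2-ol, CC(O)C: the two methyl carbons end up in the same class.
classes = symmetry_classes(["C", "C", "O", "C"], [(0, 1), (1, 2), (1, 3)])
print(classes)
```

Simple refinement like this can leave genuinely different atoms in the same class for highly regular graphs, which is part of why step one is the hard one.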

  Piman314 Good day at the #RDKitUGM2018 today, probably not the most exciting use of annual leave ever, but I enjoyed myself!

 RSC_CICAG End of an amazing first day at the #RDKitUGM2018. Now off to the conference dinner at Magdalene College.

iwatobipen https://t.co/0thyiqKWIq rd_filters; useful molecular filter tools #RDKitUGM2018

nathanbroon @RSC_CICAG @wpwalters .@wpwalters’ blog is awesomely awesome: https://t.co/ibVE0UIqoN #RDKitUGM2018

 RSC_CICAG Last, but not least, for today is the awesome @wpwalters on open source stuff built on the #RDKit. #RDKitUGM2018 https://t.co/HAnJN7qB0v
--> nathanbroon @RSC_CICAG @wpwalters .@wpwalters’ blog is awesomely awesome: https://t.co/ibVE0UIqoN #RDKitUGM2018

CzodrowskiPaul #RDKitUGM2018 My personal keynote lecture (by @wpwalters) will start soon "A Few (Hopefully) Interesting Open Source Projects Built On The RDKit"

CzodrowskiPaul #RDKitUGM2018 @SantaFeDoshin Multi-arm bandits to be used for Bayesian optimization https://t.co/VDs427YpgP
--> HessianInf @CzodrowskiPaul @SantaFeDoshin I first gave that talk in Santa Fe.  Glad to see someone else using the idea.
--> HessianInf @CzodrowskiPaul @SantaFeDoshin Well. Maybe just Bruce and I then.
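
(ed: for readers unfamiliar with multi-armed bandits, here is a minimal sketch of one Bayesian bandit strategy, Thompson sampling with Beta posteriors. It is an assumption that this resembles what was shown in the talk; the arm success rates, the function name and the seed are purely illustrative.)

```python
import random


def thompson_sampling(true_rates, n_rounds, seed=0):
    """Simulate a Bernoulli bandit and return (wins, losses) per arm.

    true_rates: hidden success probability of each arm (hypothetical values)
    """
    rng = random.Random(seed)
    wins = [0] * len(true_rates)
    losses = [0] * len(true_rates)
    for _ in range(n_rounds):
        # Draw a plausible success rate for each arm from its Beta posterior
        # (uniform Beta(1, 1) prior updated with the observed wins and losses).
        samples = [
            rng.betavariate(wins[i] + 1, losses[i] + 1)
            for i in range(len(true_rates))
        ]
        # Play the arm whose sampled rate is highest.
        arm = samples.index(max(samples))
        if rng.random() < true_rates[arm]:
            wins[arm] += 1
        else:
            losses[arm] += 1
    return wins, losses


wins, losses = thompson_sampling([0.2, 0.5, 0.8], n_rounds=1000)
pulls = [w + l for w, l in zip(wins, losses)]
print(pulls)  # number of times each arm was played
```

The appeal is that exploration falls out of the posterior uncertainty: arms with little data produce widely varying samples and so still get tried occasionally.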

iwatobipen Bayesian Bandit Explorer
https://t.co/n2oQ1HFSkd
#RDKitUGM2018

 RSC_CICAG Roger’s face (@nmsoftware) when the multi-arm bandits appeared was a joyous picture! #RDKitUGM2018 https://t.co/ZV7PCxTFXZ

nathanbroon @friedo91 There have been a few. This one from ICR has a prospective validation: https://t.co/BqrhAZsX1K #RDKitUGM2018
-->   friedo91 @nathanbroon We have also few examples from our lab at #ETHZ. But the number of experimental validations of recent generative models is rather low.
https://t.co/MUDQbuP3ab
    --> nathanbroon @friedo91 I’m talking about de novo design at the GCC in November…
        -->   friedo91 @nathanbroon Unfortunately, I can't make it to Mainz in November, but my colleagues will be there ;-) Or @marwinsegler can tell me more about it next time when he will stop in Zurich, on purpose or not :-D
        --> tiago_medchem @nathanbroon @friedo91 great speaker lineup from what I see...

nathanbroon I’m loving the elegant abuse of SMILES syntax to generate new molecular structures. #RDKitUGM2018

nathanbroon De novo design (or generative chemistry) is certainly having its moment in the sun (again!). I think there are four talks on the subject in this #RDKitUGM2018 https://t.co/NOn0NApBfp
-->   friedo91 @nathanbroon Hope to see "real" (synthesized and tested) de novo designs in the near future  ;-)
    --> nathanbroon @friedo91 There have been a few. This one from ICR has a prospective validation: https://t.co/BqrhAZsX1K #RDKitUGM2018
    --> nathanbroon @friedo91 I’ve also used some of these approaches for molecules that are now in the clinic.
        -->   friedo91 @nathanbroon We have also few examples from our lab at #ETHZ. But the number of experimental validations of recent generative models is rather low.
https://t.co/MUDQbuP3ab
            --> nathanbroon @friedo91 I’m talking about de novo design at the GCC in November…
                -->   friedo91 @nathanbroon Unfortunately, I can't make it to Mainz in November, but my colleagues will be there ;-) Or @marwinsegler can tell me more about it next time when he will stop in Zurich, on purpose or not :-D
                --> tiago_medchem @nathanbroon @friedo91 great speaker lineup from what I see...

CzodrowskiPaul #RDKitUGM2018 @SantaFeDoshin from DE Shaw: "The most elegant data structures are those that leverage the location of the data as information in itself"

 RSC_CICAG Brian Cole (DESRes) up next talking about de novo design using SMILES. #RDKitUGM2018 https://t.co/ZzP20f3YAr

iwatobipen SMILES-driven de novo design engine #RDKitUGM2018

     jwmay I'll just leave this here: https://t.co/PfRnY3WQ29 #RDKitUGM2018

baoilleach #RDKitUGM2018 Looking into automatic BRO generation (in progress).

When you put in the cell extra enzymes, you can have off-target effects. Can affect other pathways or v.v. The tool will flag up where these might happen

baoilleach #RDKitUGM2018 Once all done, you look for gene knockouts that will push things towards the desired pathway. Want best growth rate also with high yield.
--> baoilleach #RDKitUGM2018 Looking into automatic BRO generation (in progress).

When you put in the cell extra enzymes, you can have off-target effects. Can affect other pathways or v.v. The tool will flag up where these might happen

baoilleach #RDKitUGM2018 Step 1, apply reactions. Next look at mass balance, delta G, and check whether feasible. We use chemical similarity to only choose reactions that move towards the metabolome. But sometimes in practice it does go a bit back, so we use a threshold.
--> baoilleach #RDKitUGM2018 Once all done, you look for gene knockouts that will push things towards the desired pathway. Want best growth rate also with high yield.
    --> baoilleach #RDKitUGM2018 Looking into automatic BRO generation (in progress).

When you put in the cell extra enzymes, you can have off-target effects. Can affect other pathways or v.v. The tool will flag up where these might happen

baoilleach #RDKitUGM2018 We start with the target cmpd, create the rxns possible for that cmpd and keep going until we reach a cmpd that's present in the cell.
--> baoilleach Rxn rules expressed in SMIRKS, biochemical rxn operators (BROs). Also have info in db on cosubstrates and coproducts, as well as EC numbers. But RDKit uses reaction SMARTS. We want to open source our tools, so we want to move to this.
    --> baoilleach #RDKitUGM2018 Step 1, apply reactions. Next look at mass balance, delta G, and check whether feasible. We use chemical similarity to only choose reactions that move towards the metabolome. But sometimes in practice it does go a bit back, so we use a threshold.
        --> baoilleach #RDKitUGM2018 Once all done, you look for gene knockouts that will push things towards the desired pathway. Want best growth rate also with high yield.
            --> baoilleach #RDKitUGM2018 Looking into automatic BRO generation (in progress).

When you put in the cell extra enzymes, you can have off-target effects. Can affect other pathways or v.v. The tool will flag up where these might happen
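
(ed: the backwards search described in this thread, starting at the target compound and applying reactions in reverse until a compound already present in the cell is reached, can be sketched as a breadth-first search. This is a toy illustration and not GEM-Path itself. The compound names, the `reverse_reactions` lookup table and the `max_steps` cutoff are hypothetical, and the mass balance, delta G and similarity checks from the talk are left out.)

```python
from collections import deque


def find_pathway(target, reverse_reactions, host_metabolites, max_steps=5):
    """Return a list of compounds from a host metabolite to the target, or None.

    reverse_reactions: dict mapping a compound to the precursors that some
    reaction rule could make it from (a stand-in for applying real rules).
    """
    queue = deque([[target]])
    seen = {target}
    while queue:
        path = queue.popleft()
        compound = path[-1]
        if compound in host_metabolites:
            # Reverse so the pathway reads host metabolite -> ... -> target.
            return path[::-1]
        if len(path) > max_steps:
            continue  # give up on branches that get too long
        for precursor in reverse_reactions.get(compound, []):
            if precursor not in seen:
                seen.add(precursor)
                queue.append(path + [precursor])
    return None


# Hypothetical toy network.
rules = {
    "target": ["intermediate_a", "intermediate_b"],
    "intermediate_a": ["pyruvate"],
    "intermediate_b": ["exotic_precursor"],
}
print(find_pathway("target", rules, host_metabolites={"pyruvate"}))
```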

baoilleach #RDKitUGM2018 We want to move away from petrochemicals, e.g. via cells and biomass. But...target cmpd may have low yield in host or not present at all. So we need synthetic pathway calculation...GEM-Path. A type of retrosynthesis.
--> baoilleach #RDKitUGM2018 We start with the target cmpd, create the rxns possible for that cmpd and keep going until we reach a cmpd that's present in the cell.
    --> baoilleach Rxn rules expressed in SMIRKS, biochemical rxn operators (BROs). Also have info in db on cosubstrates and coproducts, as well as EC numbers. But RDKit uses reaction SMARTS. We want to open source our tools, so we want to move to this.
        --> baoilleach #RDKitUGM2018 Step 1, apply reactions. Next look at mass balance, delta G, and check whether feasible. We use chemical similarity to only choose reactions that move towards the metabolome. But sometimes in practice it does go a bit back, so we use a threshold.
            --> baoilleach #RDKitUGM2018 Once all done, you look for gene knockouts that will push things towards the desired pathway. Want best growth rate also with high yield.
                --> baoilleach #RDKitUGM2018 Looking into automatic BRO generation (in progress).

When you put in the cell extra enzymes, you can have off-target effects. Can affect other pathways or v.v. The tool will flag up where these might happen

baoilleach #RDKitUGM2018 Works in comp chem group, working on GemPath, a genome-scale model based tool for metabolic pathways prediction. Initially developed in Matlab, but now moved to Python.
--> baoilleach #RDKitUGM2018 We want to move away from petrochemicals, e.g. via cells and biomass. But...target cmpd may have low yield in host or not present at all. So we need synthetic pathway calculation...GEM-Path. A type of retrosynthesis.
    --> baoilleach #RDKitUGM2018 We start with the target cmpd, create the rxns possible for that cmpd and keep going until we reach a cmpd that's present in the cell.
        --> baoilleach Rxn rules expressed in SMIRKS, biochemical rxn operators (BROs). Also have info in db on cosubstrates and coproducts, as well as EC numbers. But RDKit uses reaction SMARTS. We want to open source our tools, so we want to move to this.
            --> baoilleach #RDKitUGM2018 Step 1, apply reactions. Next look at mass balance, delta G, and check whether feasible. We use chemical similarity to only choose reactions that move towards the metabolome. But sometimes in practice it does go a bit back, so we use a threshold.
                --> baoilleach #RDKitUGM2018 Once all done, you look for gene knockouts that will push things towards the desired pathway. Want best growth rate also with high yield.
                    --> baoilleach #RDKitUGM2018 Looking into automatic BRO generation (in progress).

When you put in the cell extra enzymes, you can have off-target effects. Can affect other pathways or v.v. The tool will flag up where these might happen

 RSC_CICAG Next up is Marina Fedorova from Technical University of Denmark talking on prediction of metabolic pathways. #RDKitUGM2018 https://t.co/oUVkfslWSi

baoilleach #RDKitUGM2018 Marina Fedorova from DTU on Computational tools for metabolic pathways pred
--> baoilleach #RDKitUGM2018 Works in comp chem group, working on GemPath, a genome-scale model based tool for metabolic pathways prediction. Initially developed in Matlab, but now moved to Python.
    --> baoilleach #RDKitUGM2018 We want to move away from petrochemicals, e.g. via cells and biomass. But...target cmpd may have low yield in host or not present at all. So we need synthetic pathway calculation...GEM-Path. A type of retrosynthesis.
        --> baoilleach #RDKitUGM2018 We start with the target cmpd, create the rxns possible for that cmpd and keep going until we reach a cmpd that's present in the cell.
            --> baoilleach Rxn rules expressed in SMIRKS, biochemical rxn operators (BROs). Also have info in db on cosubstrates and coproducts, as well as EC numbers. But RDKit uses reaction SMARTS. We want to open source our tools, so we want to move to this.
                --> baoilleach #RDKitUGM2018 Step 1, apply reactions. Next look at mass balance, delta G, and check whether feasible. We use chemical similarity to only choose reactions that move towards the metabolome. But sometimes in practice it does go a bit back, so we use a threshold.
                    --> baoilleach #RDKitUGM2018 Once all done, you look for gene knockouts that will push things towards the desired pathway. Want best growth rate also with high yield.
                        --> baoilleach #RDKitUGM2018 Looking into automatic BRO generation (in progress).

When you put in the cell extra enzymes, you can have off-target effects. Can affect other pathways or v.v. The tool will flag up where these might happen

baoilleach @Piman314 #RDKitUGM2018 But the probs are not independent as some molecules are v. similar. Incorporate mol similarity to account for this (details not shown).

 RSC_CICAG @macinchem @Piman314 https://t.co/SrxuHDz7Ou #RDKitUGM2018

baoilleach @Piman314 #RDKitUGM2018 Moving onto cmpd selection. We want to choose the best combination of mols to opt our chances of success. "We use a little bit of optimization theory". What's the prob of this mol matching our criteria given these model scores and the error rates of those models.
--> baoilleach @Piman314 #RDKitUGM2018 But the probs are not independent as some molecules are v. similar. Incorporate mol similarity to account for this (details not shown).
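
(ed: one way to read "the prob of this mol matching our criteria" given a model score and that model's error rate. This sketch assumes each model's error is Gaussian with a known standard deviation and that errors are independent across criteria. Both are assumptions made for the example, not details from the talk, and the correlation between similar molecules mentioned above is not handled. Function names and numbers are illustrative.)

```python
import math


def prob_above(prediction, threshold, sigma):
    """P(true value > threshold) when the model error is Gaussian with std sigma."""
    z = (threshold - prediction) / sigma
    return 0.5 * (1.0 - math.erf(z / math.sqrt(2.0)))


def prob_all_criteria(predictions, thresholds, sigmas):
    """Probability that every criterion is met, treating model errors as independent."""
    p = 1.0
    for prediction, threshold, sigma in zip(predictions, thresholds, sigmas):
        p *= prob_above(prediction, threshold, sigma)
    return p


# Hypothetical molecule: predicted pIC50 of 7.5 (want > 7, model error 0.5)
# and predicted solubility score of 4.0 (want > 3.5, model error 1.0).
print(prob_all_criteria([7.5, 4.0], [7.0, 3.5], [0.5, 1.0]))
```

A prediction sitting exactly on the threshold gives a probability of 0.5, and a noisier model pulls every probability towards 0.5, which is why the error rates matter as much as the scores.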

baoilleach @Piman314 #RDKitUGM2018 ten iterations, keep top 100 mols at each iteration, 10000 replacements.
--> baoilleach @Piman314 #RDKitUGM2018 Moving onto cmpd selection. We want to choose the best combination of mols to opt our chances of success. "We use a little bit of optimization theory". What's the prob of this mol matching our criteria given these model scores and the error rates of those models.
    --> baoilleach @Piman314 #RDKitUGM2018 But the probs are not independent as some molecules are v. similar. Incorporate mol similarity to account for this (details not shown).

baoilleach @Piman314 #RDKitUGM2018 References the MOARF paper. Fragmentation algorithm described in paper. Replace a fragment, then filter, scoring with autodock/qsar. Iterate multiple times.
--> baoilleach @Piman314 #RDKitUGM2018 ten iterations, keep top 100 mols at each iteration, 10000 replacements.
    --> baoilleach @Piman314 #RDKitUGM2018 Moving onto cmpd selection. We want to choose the best combination of mols to opt our chances of success. "We use a little bit of optimization theory". What's the prob of this mol matching our criteria given these model scores and the error rates of those models.
        --> baoilleach @Piman314 #RDKitUGM2018 But the probs are not independent as some molecules are v. similar. Incorporate mol similarity to account for this (details not shown).

nathanbroon .@Piman314 stole my slide! #RDKitUGM2018 https://t.co/3M9BIlNTA3

baoilleach @Piman314 #RDKitUGM2018 De novo design (or generative chemistry) is an efficient method of searching chemistry space to satisfy multiple criteria
--> baoilleach @Piman314 #RDKitUGM2018 References the MOARF paper. Fragmentation algorithm described in paper. Replace a fragment, then filter, scoring with autodock/qsar. Iterate multiple times.
    --> baoilleach @Piman314 #RDKitUGM2018 ten iterations, keep top 100 mols at each iteration, 10000 replacements.
        --> baoilleach @Piman314 #RDKitUGM2018 Moving onto cmpd selection. We want to choose the best combination of mols to opt our chances of success. "We use a little bit of optimization theory". What's the prob of this mol matching our criteria given these model scores and the error rates of those models.
            --> baoilleach @Piman314 #RDKitUGM2018 But the probs are not independent as some molecules are v. similar. Incorporate mol similarity to account for this (details not shown).

baoilleach @Piman314 #RDKitUGM2018 (ed: I wonder if https://t.co/2sKFF64JOz is relevant to the problems found with MAE)
--> baoilleach @Piman314 #RDKitUGM2018 De novo design (or generative chemistry) is an efficient method of searching chemistry space to satisfy multiple criteria
    --> baoilleach @Piman314 #RDKitUGM2018 References the MOARF paper. Fragmentation algorithm described in paper. Replace a fragment, then filter, scoring with autodock/qsar. Iterate multiple times.
        --> baoilleach @Piman314 #RDKitUGM2018 ten iterations, keep top 100 mols at each iteration, 10000 replacements.
            --> baoilleach @Piman314 #RDKitUGM2018 Moving onto cmpd selection. We want to choose the best combination of mols to opt our chances of success. "We use a little bit of optimization theory". What's the prob of this mol matching our criteria given these model scores and the error rates of those models.
                --> baoilleach @Piman314 #RDKitUGM2018 But the probs are not independent as some molecules are v. similar. Incorporate mol similarity to account for this (details not shown).

CzodrowskiPaul #RDKitUGM2018  Nic @Piman314  "If you don't have very active data, use a less complex model"

baoilleach @Piman314 #RDKitUGM2018 Rather than using random train/test splits, we formalized the loss function and the training/test splits.
--> baoilleach @Piman314 #RDKitUGM2018 (ed: I wonder if https://t.co/2sKFF64JOz is relevant to the problems found with MAE)
    --> baoilleach @Piman314 #RDKitUGM2018 De novo design (or generative chemistry) is an efficient method of searching chemistry space to satisfy multiple criteria
        --> baoilleach @Piman314 #RDKitUGM2018 References the MOARF paper. Fragmentation algorithm described in paper. Replace a fragment, then filter, scoring with autodock/qsar. Iterate multiple times.
            --> baoilleach @Piman314 #RDKitUGM2018 ten iterations, keep top 100 mols at each iteration, 10000 replacements.
                --> baoilleach @Piman314 #RDKitUGM2018 Moving onto cmpd selection. We want to choose the best combination of mols to opt our chances of success. "We use a little bit of optimization theory". What's the prob of this mol matching our criteria given these model scores and the error rates of those models.
                    --> baoilleach @Piman314 #RDKitUGM2018 But the probs are not independent as some molecules are v. similar. Incorporate mol similarity to account for this (details not shown).

baoilleach @Piman314 #RDKitUGM2018 450K simulations. Overall RF and sim searching not much better than random in this process. Cannot extrapolate v well. Ridge regression much better (and often much faster).

baoilleach @Piman314 #RDKitUGM2018 (Just published in JCIM) Try to set up a training/test set with data similar to what you would have in an early stage drug discov project. Simulate the iterative process. Train model on training set - predict on test - select.
--> baoilleach @Piman314 #RDKitUGM2018 450K simulations. Overall RF and sim searching not much better than random in this process. Cannot extrapolate v well. Ridge regression much better (and often much faster).

baoilleach @Piman314 #RDKitUGM2018 We frequently want to make preds outside of the applicability domain. E.g. just starting a project. Some algos better than others for extrapolation. Which should we be using in drug discovery?
--> baoilleach @Piman314 #RDKitUGM2018 (Just published in JCIM) Try to set up a training/test set with data similar to what you would have in an early stage drug discov project. Simulate the iterative process. Train model on training set - predict on test - select.
    --> baoilleach @Piman314 #RDKitUGM2018 450K simulations. Overall RF and sim searching not much better than random in this process. Cannot extrapolate v well. Ridge regression much better (and often much faster).

 RSC_CICAG Last minute change of title for @Piman314. #RDKitUGM2018 https://t.co/PCHSxd0Pqg
-->  macinchem @RSC_CICAG @Piman314 Is Evariste a new company?
    -->  RSC_CICAG @macinchem @Piman314 About a year old.
    -->  RSC_CICAG @macinchem @Piman314 https://t.co/SrxuHDz7Ou #RDKitUGM2018

baoilleach @Piman314 #RDKitUGM2018 Company founded by Oliver Watson (from quant background) doing QSAR modelling, de novo design and selecting optimal subsets.
--> baoilleach @Piman314 #RDKitUGM2018 We frequently want to make preds outside of the applicability domain. E.g. just starting a project. Some algos better than others for extrapolation. Which should we be using in drug discovery?
    --> baoilleach @Piman314 #RDKitUGM2018 (Just published in JCIM) Try to set up a training/test set with data similar to what you would have in an early stage drug discov project. Simulate the iterative process. Train model on training set - predict on test - select.
        --> baoilleach @Piman314 #RDKitUGM2018 450K simulations. Overall RF and sim searching not much better than random in this process. Cannot extrapolate v well. Ridge regression much better (and often much faster).

 RSC_CICAG First after the break is @Piman314 talking about multiparameter optimisation. #RDKitUGM2018 https://t.co/UI3fpimnRh
-->  RSC_CICAG Last minute change of title for @Piman314. #RDKitUGM2018 https://t.co/PCHSxd0Pqg
    -->  macinchem @RSC_CICAG @Piman314 Is Evariste a new company?
        -->  RSC_CICAG @macinchem @Piman314 About a year old.
        -->  RSC_CICAG @macinchem @Piman314 https://t.co/SrxuHDz7Ou #RDKitUGM2018

baoilleach #RDKitUGM2018 Nicholas Firth, CSO of Evariste Technologies (and @Piman314) on multiparameter opt using RDKit and scipy
--> baoilleach @Piman314 #RDKitUGM2018 Company founded by Oliver Watson (from quant background) doing QSAR modelling, de novo design and selecting optimal subsets.
--> baoilleach @Piman314 #RDKitUGM2018 Rather than using random train/test splits, we formalized the loss function and the training/test splits.
    --> baoilleach @Piman314 #RDKitUGM2018 (ed: I wonder if https://t.co/2sKFF64JOz is relevant to the problems found with MAE)
        --> baoilleach @Piman314 #RDKitUGM2018 De novo design (or generative chemistry) is an efficient method of searching chemistry space to satisfy multiple criteria
            --> baoilleach @Piman314 #RDKitUGM2018 References the MOARF paper. Fragmentation algorithm described in paper. Replace a fragment, then filter, scoring with autodock/qsar. Iterate multiple times.
                --> baoilleach @Piman314 #RDKitUGM2018 ten iterations, keep top 100 mols at each iteration, 10000 replacements.
                    --> baoilleach @Piman314 #RDKitUGM2018 Moving onto cmpd selection. We want to choose the best combination of mols to opt our chances of success. "We use a little bit of optimization theory". What's the prob of this mol matching our criteria given these model scores and the error rates of those models.
                        --> baoilleach @Piman314 #RDKitUGM2018 But the probs are not independent as some molecules are v. similar. Incorporate mol similarity to account for this (details not shown).
    --> baoilleach @Piman314 #RDKitUGM2018 We frequently want to make preds outside of the applicability domain. E.g. just starting a project. Some algos better than others for extrapolation. Which should we be using in drug discovery?
        --> baoilleach @Piman314 #RDKitUGM2018 (Just published in JCIM) Try to set up a training/test set with data similar to what you would have in an early stage drug discov project. Simulate the iterative process. Train model on training set - predict on test - select.
            --> baoilleach @Piman314 #RDKitUGM2018 450K simulations. Overall RF and sim searching not much better than random in this process. Cannot extrapolate v well. Ridge regression much better (and often much faster).

 RSC_CICAG Astonishing number of open positions in #Chemoinformatics #Chemonformatics at the moment: more than a dozen jobs. Really telling that the field is truly having a Renaissance. #RDKitUGM2018

CzodrowskiPaul Boran Adas https://t.co/SECO4KWQfN about his @gsoc project about fingerprint generation #RDKitUGM2018
(including a variance filter which I find highly interesting) https://t.co/WH18nayVtK

 RSC_CICAG Really excited to see this work on fingerprint generators in #RDKit from Boran Adas during his #GSoC placement. #RDKitUGM2018 @gsoc https://t.co/a8rlKPFRxg

CzodrowskiPaul Tim Dudgeon on his fragment analysis based on work from Astex https://t.co/u2iAe42Cc0 #RDKitUGM2018 https://t.co/0qVm5BPCgl

 RSC_CICAG Next up is Susan Leung from Oxford talking about the integration of #MolVS to #RDKit. #RDKitUGM2018 https://t.co/ZX0jsUZHRH

CzodrowskiPaul Paolo Tosco from @cressetgroup presents a tight integration of @RDKit_org in their tool Flare #RDKitUGM2018 https://t.co/1llL0EKnnT

benevolent_ai Catch us at the 7th @RDKit_org UGM 2018 event in Cambridge tomorrow presenting on methods for bioisosteric replacement #RDKitUGM2018. More details on the event here:  https://t.co/gUHDrPsCv4

CzodrowskiPaul Alpha Lee https://t.co/m4zbwJoJM1  on stage and presents his reaction prediction (will be open source in a few weeks time) #RDKitUGM2018 https://t.co/2dagr8wcLm

 RSC_CICAG Looks like the work of Joseph Wright of Derby. #RDKitUGM2018 https://t.co/RsjivXwBFq

 RSC_CICAG Last talk of the morning from Alpha Lee at Cambridge on uncertainty in molecular #DeepLearning. #RDKitUGM2018 https://t.co/NeAgiUwzxU
-->  RSC_CICAG Looks like the work of Joseph Wright of Derby. #RDKitUGM2018 https://t.co/RsjivXwBFq

 RSC_CICAG Next up is Sereina Riniker from the ETH on the 3D conformer generator in the RDKit. Sereina admits she is bad at coming up with acronyms, ergo ETKDG… (experimental torsion knowledge distance geometry) #RDKitUGM2018 https://t.co/S1YL1WUs3Q

CzodrowskiPaul Impressive speed of similarity search, presented by Pat Lorton @Schrodinger #RDKitUGM2018 https://t.co/9Vgvpfob2R

 RSC_CICAG Next talk: GPUSimilarity: similarity searching a billion compounds in real-time by Pat Lorton at Schrödinger. #RDKitUGM2018 https://t.co/xGeIL0ICRu
-->  RSC_CICAG Code base is here: https://t.co/weU50Gs6F4
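
(ed: the core operation behind fingerprint similarity searching is a Tanimoto comparison of bit vectors, which reduces to a few bitwise operations and population counts per pair. The sketch below shows that operation on the CPU in plain Python, with fingerprints stored as integers. It is not the GPUSimilarity code, and the fingerprints and compound names are invented for the example.)

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints stored as Python ints."""
    common = bin(fp_a & fp_b).count("1")  # bits set in both
    total = bin(fp_a | fp_b).count("1")   # bits set in either
    return common / total if total else 0.0


def top_hits(query, library, k=3):
    """Return the k most similar library entries as (name, score) pairs."""
    scored = sorted(
        ((tanimoto(query, fp), name) for name, fp in library.items()),
        reverse=True,
    )
    return [(name, score) for score, name in scored[:k]]


# Hypothetical 8-bit fingerprints.
library = {
    "cmpd_1": 0b10110010,
    "cmpd_2": 0b10110000,
    "cmpd_3": 0b01001101,
}
print(top_hits(0b10110010, library, k=2))
```

Because every pair is scored independently with the same cheap operations, the search parallelises well, which is what makes brute-force searching of very large libraries feasible on a GPU.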

iwatobipen Next, GPUSimilarity: #RDKitUGM2018
--> iwatobipen https://t.co/Q2C4ZVWVy3

iwatobipen Code example #RDKitUGM2018
 https://t.co/7pc8WgDHg6 https://t.co/h5joQlinWU

 RSC_CICAG @nathanbroon Here is the link for EyeMol: https://t.co/ugDEjCJZo6 #RDKitUGM2018

nathanbroon EyeMol for interactive dataset management from Christophe Molina. #RDKitUGM2018 https://t.co/DSMDYp7jZb
-->  RSC_CICAG @nathanbroon Here is the link for EyeMol: https://t.co/ugDEjCJZo6 #RDKitUGM2018

 RSC_CICAG Second poster on OpenRiskNet. #RDKitUGM2018 https://t.co/CGHlh17HTI

 RSC_CICAG Two posters from Tim Dudgeon on Squonk using the RDKit. #RDKitUGM2018 https://t.co/9WQWUuKfWx
-->  RSC_CICAG Second poster on OpenRiskNet. #RDKitUGM2018 https://t.co/CGHlh17HTI

 RSC_CICAG Second poster is from Connor at Heptares on generation of 2D coordinates for complex peptides using HELM format. #RDKitUGM2018

 RSC_CICAG First poster at #RDKitUGM2018 https://t.co/x7Ipr4uChT

 RSC_CICAG Now onto flash poster presentations. #RDKitUGM2018

 RSC_CICAG Amazing interactive interrogation of fingerprint bits set and the fragments that set them with colour coding of contribution. Really cool and very powerful feature. #RDKitUGM2018

CzodrowskiPaul #RDKitUGM2018    New fingerprint visualizer in @RDKit_org by @foo_fighterin - looks really cool: https://t.co/7WEPYVuajf

 RSC_CICAG All the #RDKitUGM2018 content will be posted here during and after the meeting: https://t.co/Z9A1hGzfHP

iwatobipen DrawMorganBit function is cool! Visualize bit information #RDKitUGM2018

 RSC_CICAG Great improvements to the RDKit being explained: MolVS integration, general fingerprinting interface and 2D coordinate generator. #RDKitUGM2018

iwatobipen Coordgen
https://t.co/5S2rwSm8S4 #RDKitUGM2018

 macinchem I'll be keeping an eye out for the tweets. #RDKitUGM2018 https://t.co/Vy2g5IeAVi

 RSC_CICAG CICAG are at the #RDKitUGM2018 and will be live tweeting for the next couple of days. https://t.co/VE4JL0qeKe

nathanbroon Start of the #RDKitUGM2018. Who is here and tweeting? @dr_greg_landrum @RDKit_org @iwatobipen @jwmay @DrJoshuaBox @wojcikowskim @AndreasBenderUK @CzodrowskiPaul @wpwalters @baoilleach @mattymattmo @mattswain123 @RichardSherhod https://t.co/L0AFlSgq94
--> nathanbroon And I forgot @Piman314…
--> mattswain123 @nathanbroon @dr_greg_landrum @RDKit_org @iwatobipen @jwmay @DrJoshuaBox @wojcikowskim @AndreasBenderUK @CzodrowskiPaul @wpwalters @baoilleach @mattymattmo @RichardSherhod Here!
-->  paulcoxon @nathanbroon @dr_greg_landrum @RDKit_org @iwatobipen @jwmay @DrJoshuaBox @wojcikowskim @AndreasBenderUK @CzodrowskiPaul @wpwalters @baoilleach @mattymattmo @mattswain123 @RichardSherhod Ah the Squibb, the most uncomfortable lecture benches in the entire uni 😖
-->  wpwalters @nathanbroon @dr_greg_landrum @RDKit_org @iwatobipen @jwmay @DrJoshuaBox @wojcikowskim @AndreasBenderUK @CzodrowskiPaul @baoilleach @mattymattmo @mattswain123 @RichardSherhod present
--> baoilleach @nathanbroon @dr_greg_landrum @RDKit_org @iwatobipen @jwmay @DrJoshuaBox @wojcikowskim @AndreasBenderUK @CzodrowskiPaul @wpwalters @mattymattmo @mattswain123 @RichardSherhod Yes but electricity-depleted. I keep charging in sockets that are not plugged in. Doh!
    --> nathanbroon @baoilleach @dr_greg_landrum @RDKit_org @iwatobipen @jwmay @DrJoshuaBox @wojcikowskim @AndreasBenderUK @CzodrowskiPaul @wpwalters @mattymattmo @mattswain123 @RichardSherhod I have a high capacity power bank if needed. #alwaysprepared https://t.co/VwujfB71II
        --> CzodrowskiPaul @nathanbroon @baoilleach @dr_greg_landrum @RDKit_org @iwatobipen @jwmay @DrJoshuaBox @wojcikowskim @AndreasBenderUK @wpwalters @mattymattmo @mattswain123 @RichardSherhod I still vote +1 for the chalk board!
.. no electricity needed ..
            --> nathanbroon @CzodrowskiPaul @baoilleach @dr_greg_landrum @RDKit_org @iwatobipen @jwmay @DrJoshuaBox @wojcikowskim @AndreasBenderUK @wpwalters @mattymattmo @mattswain123 @RichardSherhod I can tell you’re a Professor! 👨‍🏫
            -->  RDKit_org @CzodrowskiPaul @nathanbroon @baoilleach @dr_greg_landrum @iwatobipen @jwmay @DrJoshuaBox @wojcikowskim @AndreasBenderUK @wpwalters @mattymattmo @mattswain123 @RichardSherhod Notice that @CzodrowskiPaul did not, however, actually use the chalk board during *his* lightning talk, so you maybe should dismiss his comments here.
                --> CzodrowskiPaul @RDKit_org @nathanbroon @baoilleach @dr_greg_landrum @iwatobipen @jwmay @DrJoshuaBox @wojcikowskim @AndreasBenderUK @wpwalters @mattymattmo @mattswain123 @RichardSherhod Thanks for your comment (totally forgot to use the board), will no longer rant on this and be quiet & modest
    --> nathanbroon @wpwalters @dr_greg_landrum @RDKit_org @iwatobipen @jwmay @DrJoshuaBox @wojcikowskim @AndreasBenderUK @CzodrowskiPaul @baoilleach @mattymattmo @mattswain123 @RichardSherhod ✅
    --> CzodrowskiPaul @nathanbroon @Piman314 Will there be a Tweet-Up in one of the breaks?
Are you volunteering, @nathanbroon ?
    --> nathanbroon And @foo_fighterin
        --> nathanbroon And @hjuinj
            --> nathanbroon And @SiFulle
        --> nathanbroon @CzodrowskiPaul @Piman314 In one of the pubs? 🍻

CzodrowskiPaul #RDKitUGM2018: we are getting started https://t.co/u1muJrC3nj

dr_greg_landrum The room is about to start filling up...
#RDKitUGM2018 https://t.co/Cxn8c8eE8V

iwatobipen Ready! #RDKitUGM2018 https://t.co/VO11ha0Ber

CzodrowskiPaul For me, I’m soon ready for the #RDKitUGM2018 https://t.co/dZVNVoS8VQ

nathanbroon Arrived in Cambridge for the #RDKitUGM2018. Looking forward to the next few days: lots of great science and conversations with the coolest geeks! 🤓
-->   mjmralph @nathanbroon Ooooh. I love me a geek - find me a sexy single one?
-->  paulcoxon @nathanbroon that's a pretty long hashtag there
    --> nathanbroon @paulcoxon Tell me about it!
        -->  paulcoxon @nathanbroon What does it meeeean?
            --> nathanbroon @paulcoxon 2018 User Group Meeting for the @RDKit_org https://t.co/9XvscL7la1
                -->  paulcoxon @nathanbroon @RDKit_org Sounds fun
                    --> nathanbroon @paulcoxon @RDKit_org It is!

dr_greg_landrum @wojcikowskim We will be meeting at 18:00 at the College Bar in King's College for beers. Enter via Porters Lodge: https://t.co/gz4lQOfCRV
#RDKitUGM2018
--> CzodrowskiPaul @dr_greg_landrum @wojcikowskim Will you send around a tweet once you change the place?
Or will you stay there all over the evening?
    --> dr_greg_landrum @CzodrowskiPaul @wojcikowskim Yeah, I will tweet any change of location that I'm involved in.
        --> baoilleach @dr_greg_landrum @CzodrowskiPaul @wojcikowskim Might be worth emailing around about the drinks tonight. We've heard from another attendee who wasn't aware.
            --> wojcikowskim @baoilleach @dr_greg_landrum @CzodrowskiPaul I think it was mentioned in one of the gigantic emails from Andreas ;-)
                --> baoilleach @wojcikowskim @dr_greg_landrum @CzodrowskiPaul Oh yes - I know it was, but I think that others have missed it.
                    --> dr_greg_landrum @baoilleach @wojcikowskim @CzodrowskiPaul yep, great point. I just sent the mass email.

wojcikowskim On my way to #RDKitUGM2018 Hopefully my flight will not be delayed any longer... Beer tonight?
--> dr_greg_landrum @wojcikowskim We will be meeting at 18:00 at the College Bar in King's College for beers. Enter via Porters Lodge: https://t.co/gz4lQOfCRV
#RDKitUGM2018
    --> CzodrowskiPaul @dr_greg_landrum @wojcikowskim Will you send around a tweet once you change the place?
Or will you stay there all over the evening?
        --> dr_greg_landrum @CzodrowskiPaul @wojcikowskim Yeah, I will tweet any change of location that I'm involved in.
            --> baoilleach @dr_greg_landrum @CzodrowskiPaul @wojcikowskim Might be worth emailing around about the drinks tonight. We've heard from another attendee who wasn't aware.
                --> wojcikowskim @baoilleach @dr_greg_landrum @CzodrowskiPaul I think it was mentioned in one of the gigantic emails from Andreas ;-)
                    --> baoilleach @wojcikowskim @dr_greg_landrum @CzodrowskiPaul Oh yes - I know it was, but I think that others have missed it.
                        --> dr_greg_landrum @baoilleach @wojcikowskim @CzodrowskiPaul yep, great point. I just sent the mass email.

CzodrowskiPaul New to me Python 3.6 (not so much @RDKit_org #RDKitUGM2018 -related), but very helpful in general:
f-string formatting
e.g.  f"pH {pH} is {label}"
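
A minimal sketch of the f-string formatting mentioned in the tweet above, assuming Python 3.6 or later. The names pH and label come from the tweet; the describe_ph helper, the threshold of 7 and the label strings are hypothetical, chosen only to make the example runnable.

```python
def describe_ph(pH: float) -> str:
    """Return a short description of a pH value using an f-string."""
    # Hypothetical labelling rule, for illustration only.
    if pH < 7:
        label = "acidic"
    elif pH > 7:
        label = "basic"
    else:
        label = "neutral"
    # Expressions inside {} are evaluated and formatted in place.
    return f"pH {pH} is {label}"


def describe_ph_rounded(pH: float) -> str:
    """Same idea, with a format spec after the colon (one decimal place)."""
    return f"pH {pH:.1f}"


if __name__ == "__main__":
    print(describe_ph(7.4))
    print(describe_ph_rounded(6.25))
```

Anything after a colon inside the braces is a standard format specification, so f-strings can cover most cases that str.format() handled before.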

dr_greg_landrum Just in time production of the material for today’s @KNIME @RDKit_org training. #RDKitUGM2018 https://t.co/Ht3ht9V7ZV

CzodrowskiPaul Back to school for me - just joined Andrew Dalke’s powerful Python training in Cambridge #RDKitUGM2018 https://t.co/LcEZysTc4k

dr_greg_landrum At LGW and starting the train ride to Cambridge soon. #travelfun #RDKitUGM2018
--> georgeisyourman @dr_greg_landrum All the best for the ugm!
-->    axelp69 @dr_greg_landrum @CzodrowskiPaul , @nathanbroon Have fun, all of you!

markussitzmann @nathanbroon @nathanbroon, please keep the numbers of food photos low @hashtag #RDKitUGM2018 😀
--> nathanbroon @markussitzmann I’ll see what I can do!

CzodrowskiPaul Cool, the first notebook for the upcoming #RDKitUGM2018 has already been placed on github: https://t.co/vCCwG9RRbc

Kudos to @dr_greg_landrum !
--> dr_greg_landrum @CzodrowskiPaul yeah, uh, that got pushed a bit early. It's not quite done yet. :-)
--> nathanbroon @CzodrowskiPaul @dr_greg_landrum Will we be seeing you in Cambridge?
    --> baoilleach @nathanbroon @CzodrowskiPaul @dr_greg_landrum Royal plural?
    --> CzodrowskiPaul @nathanbroon @dr_greg_landrum absolutely!
        --> nathanbroon @baoilleach @CzodrowskiPaul @dr_greg_landrum I was speaking on behalf of Greg as well 😜
            --> baoilleach @nathanbroon @CzodrowskiPaul @dr_greg_landrum I bet that's HRM's excuse also.
                --> nathanbroon @baoilleach @CzodrowskiPaul @dr_greg_landrum https://t.co/iKM2OyXcQj
    --> wojcikowskim @dr_greg_landrum @CzodrowskiPaul We hope so, there has been so much things since last UGM!
        --> wojcikowskim @dr_greg_landrum @CzodrowskiPaul Crap, *many* things... Improved performance is one of the biggest ones!

dr_greg_landrum @rguha @nathanbroon As in past years, I'll put all the slides I get in github: https://t.co/52qE1UvuMl
#RDKitUGM2018

dr_greg_landrum @nathanbroon We're going with #RDKitUGM2018 in order to be consistent with past years. And because we've got so many characters available now!
--> nathanbroon @dr_greg_landrum Well done on digging out these tweets!
    --> dr_greg_landrum @nathanbroon ohmygod, I didn't even notice that that was from 2012.
I wasn't even on twitter at that point.
        --> nathanbroon @dr_greg_landrum Perhaps you deserve a ramen-based prize!
            --> dr_greg_landrum @nathanbroon I do! I do!

dr_greg_landrum Almost there... The pre-UGM training starts today, continues through tomorrow, and the UGM itself starts on Wednesday. I'm really looking forward to seeing everyone! #RDKitUGM2018