Monday, October 2, 2017

Watching daily Oct 2, 2017

No more fees for Coinbase Account, How to use GDAX Tutorial

For more information >> No more fees for Coinbase Account, How to Use GDAX, Tutorial Part 2 - Duration: 7:52.

-------------------------------------------

'An Act Of Pure Evil': Trump Calls For Unity After Deadly Las Vegas Shooting - Duration: 1:28.

For more information >> 'An Act Of Pure Evil': Trump Calls For Unity After Deadly Las Vegas Shooting - Duration: 1:28.

-------------------------------------------

ONE simple SECRET for better VLOGS / FILMS... - Duration: 3:04.

what is up guys and welcome to my filmmaking vlog where I try to learn how

to make better films and hopefully if you're interested you can learn

something in the process as well... last week we were talking about this little clip I

made on this boat in this really small space and how to make a little cool edit

even if you're not going to different locations and whatnot and

today we're talking about a little bit of the same topic and this is about the

second clip from the same trip to the countryside that I made last weekend

yeah before we go into that clip and talk about it let's just watch it so

here it goes three two one go

and we're back so did you enjoy it did you like the clip was it okay like

7 minus? 7 plus? something like that personally for me when I was making that

vlog what was going on in my head is that we need to tell a story we need to start

from one place and end up in another basically we weren't again going to any

cool locations we didn't have any activity in the vlog but then I

realized we can start from this forest and we can end up at the beach or at the

at this lake...somehow that makes the vlog even better so that was what was

going on in my head... no matter

if you're shooting anything like that it really helps to start in one place and end

up in another...start with the problem and solve that problem...somehow tell a little

story doesn't need to be fancy storyline or dialogue or any fancy action or

activities but it really helps if you try to tell even a little bit of a story

yeah I think that's it for today's vlog... and if you have someone in your family

who would like to learn how to make films how to make videos take photos

share this video with him or with her and yeah this vlog is not about those

YouTube is full of those you can go and watch them and learn it from there no

matter if you're shooting with an iPhone or a DSLR this is all about learning how to

make better documentary travel vlog kind of films but yeah I'm rambling here

thank you so much for watching if you enjoyed this video please hit the

subscribe button hit the like button and I'll see you in the next vlog all right

bye

For more information >> ONE simple SECRET for better VLOGS / FILMS... - Duration: 3:04.

-------------------------------------------

Tips for Your LinkedIn Profile From a Brand Strategist | Full Sail University - Duration: 3:31.

(inspiring music)

- I want a profile photo that shows me

who you are right now so that when I see that,

not if, because nowadays I will look you up

on Google before I meet you, not if but when I look you up

when I consume your online brand and

then I meet you in person if those two experiences

are consistent that's a very

positive experience for me as a fellow human being.

Alternatively, if you build a brand that is not accurate

to who you are in real life, let me use dating as an example

online dating, right, if you build a profile about

who you wish you were but aren't actually, right,

and then I see you and I'm like oh we got a hottie.

Boom, let's meet up, let's hang out.

Right, go on a first date and you are not

how you've portrayed yourself, we have a problem.

That's a very negative reaction that I'm going to have

to you and that's the dating world

but it's every other world as well.

So that's why I say it and I mean it.

It is your responsibility to accurately portray

who you are in real life online.

So your profile photo on LinkedIn and

on every platform needs to be great.

Copy, I have some opinions on this that are not

necessarily the law but I love for copy on your website

and on your LinkedIn profiles to be written in first person.

Again, there are people who will argue with me on that

and prefer it in third person but I just think like

it's 2017, we know you're writing your own LinkedIn profile.

So when it's written in third person it establishes distance

that's just not necessary, you know.

I like when things are a bit more direct.

So I tend to write copy for myself

and for my clients in first person.

I do this, this is who I am, I think people

connect with it better, it feels more conversational.

And again, just make sure you are including

keywords relevant to your brand.

What are you capable of doing, what have you done before,

what are people typing into Google and into LinkedIn

to find talent, for jobs, for gigs.

Make a whole list of what those words are and audit your

profile to make sure you've included them.

You've included them as keywords and as endorsements.

My next thing, you know, is to make sure that you're getting

endorsed for these things, LinkedIn has this built-in

feature where you can endorse others

with testimonials but also skills.

The more built up those are, the more established you look

and so that's your goal, to get to at least 500 contacts

on LinkedIn because it says 500 plus and

to get as many endorsements and skills endorsed as possible

because it just makes you look like more of an expert.

And it's pretty simple to do, particularly with classmates

for example, ask them to endorse you,

you endorse them, help each other out.

For more information >> Tips for Your LinkedIn Profile From a Brand Strategist | Full Sail University - Duration: 3:31.

-------------------------------------------

Easy Chocolate Cupcake Recipes For Beginners. Cupcake - Duration: 6:20.

Easy Chocolate Cupcake Recipes For Beginners

For more information >> Easy Chocolate Cupcake Recipes For Beginners. Cupcake - Duration: 6:20.

-------------------------------------------

Top 3 Social Media Platforms for My Brand - Phil Pallen | Full Sail University - Duration: 6:23.

(upbeat music)

- The next question I usually get is,

"OK, well I know I need to be awesome at three platforms.

"What should they be?"

Actually no, the question that I get the most often is,

"What are your top three platforms?"

So I figured, just to save someone asking that question

in question period, that I would tell you right now.

What are Phil's top three platforms?

Not necessarily what your three should be,

but they'll start to give you ideas

on how I've prioritized these.

Number one for me, Twitter.

Oh, someone just cheered. We're instantly BFFs.

Oh yeah. We already are BFFs.

Yeah, so Twitter is my number one platform

because it's quick, it's easy, as some of you know

it's the easiest way to get in touch with me.

It's for busy people, but in fact I still believe that

this is the most important social media platform

to access a targeted audience for free.

You can, if you use this platform strategically,

access a very targeted audience of people

that are interested in what you're doing.

Without having to spend money on advertising,

like you'd have to on Facebook and LinkedIn

and many other platforms where you can advertise.

You can still advertise on Twitter,

but you don't necessarily need to, depending on your goals.

If you're strategic, and I'll actually give you

some framework today on how to write those tweets,

so that you have no more excuses

when I say, "Why aren't you on Twitter?"

"I don't know what to Tweet." Yes you do.

Replay this video a few times,

and you'll get really good at it.

So Twitter for me is number one.

Number two, a platform that doesn't get as much love

as others, but in fact is one of my,

one of the most important platforms for me,

one of the most important platforms for retailers,

in terms of converting sales, Pinterest.

And this is really my best example of prioritizing

your platforms, and giving them purpose.

I'll talk about that coming up.

But I used to use Pinterest to document things that

had absolutely no relevance to my brand, like cute outfits.

(audience laughs)

Recipes that I should make.

Travel destinations that I wanted to go to.

And then I was like, "I don't really

"know what I'm doing on Pinterest.

"I don't have a clear purpose."

And then I thought well, what advice

would I give to someone else?

What advice would I give to a client

to use this platform effectively?

And I realized that I would have said,

"Give it purpose, establish a framework

"to motivate you to update it, and also give your audience

"some expectation of what they're gonna get from you."

And I don't think, as lean and mean as my physique really is,

I'm pretty sure people aren't coming to me

for workout or diet tips, OK? (laughs)

They're coming to me to know good examples of branding.

So why wasn't I pinning that?

I don't know the answer to that question. (laughs)

So I reworked what I was doing, and I decided,

"I'm gonna use Pinterest to document

"great examples of branding."

Business cards, web sites, logos.

I have over 20,000 pins, because I dedicate

5 minutes a day to pinning, and I've gone from

like 800 followers, when I first started, to over 20,000.

And that's in the course of about a year.

So that's the power of giving your

platforms purpose, prioritizing them.

I'll talk a little bit coming up

about giving each platform purpose,

but I'm just kinda previewing that idea now.

My third one I think is Instagram,

which I have a love/hate relationship with.

Shout out to Sarah, who I just gave my Instagram

password to, I'm like, "Can you just post for me?" (laughs)

'Cause you took a great photo of me earlier.

So Instagram is stressful to me, and probably

for some of you, unless you're a photographer.

I can do a lot of things well, but

driving and photographing things are

two things I do not do well. (laughs)

So I struggle with this because,

obviously in the branding world I have

very high expectations of visuals.

I have to, or I'll be unemployed.

But I find it stressful to be responsible

for those on my own, to be taking pictures,

when I travel all these cool places, and then I'm like,

"Oh god, how do I take a picture of this

"and make it look cool? "I need to be cool."

So Instagram I find stressful, but the last few months

have been better, because I've really been

clear on the purpose that I give my Instagram.

I travel a lot, as some of you know.

And so I use my Instagram to document

my travel adventures around the world.

I keep coming back to Full Sail and hitting you

over the head with the idea that,

if you choose your job and your industry wisely,

maybe you can do it from home.

Or better yet, maybe you can do it

from a hammock in Thailand.

Wouldn't that be nice?

You have the freedom to do it.

I do it.

I was just a few years ahead of you.

I graduated Full Sail, and then decided on

having a career in branding and social media,

which I can do from anywhere, so I do do it from anywhere,

and I use my Instagram to document that, and it's kinda fun.

So those are my three priority platforms.

This is where my audience is.

My audience, sure, some of them are on Facebook,

but it doesn't make the cut for me.

I don't have a Facebook page.

I have a profile where I connect with people,

but for me, this is my focus.

That's the decision I've made.

I am dedicated to making sure I can

grow these platforms by engaging my audience there.

For more information >> Top 3 Social Media Platforms for My Brand - Phil Pallen | Full Sail University - Duration: 6:23.

-------------------------------------------

Please vote for the new Voiceroid - Duration: 2:23.

For more information >> Please vote for the new Voiceroid - Duration: 2:23.

-------------------------------------------

Birthday Decoration For Children - Çocuklar için Doğum Günü Dekorasyonu (1 Yaş Çubuk Hediyelikleri) - Duration: 5:11.

For more information >> Birthday Decoration For Children - Çocuklar için Doğum Günü Dekorasyonu (1 Yaş Çubuk Hediyelikleri) - Duration: 5:11.

-------------------------------------------

Why Puerto Rico will be without power for months - Duration: 4:01.

This is Puerto Rico on a calm night in July 2017.

Here it is again, after Hurricane Maria in September.

The storm's impact has been catastrophic.

It was at its strongest when it was passing over the most populated parts of the island,

which is home to about 3.4 million people.

It knocked out Puerto Rico's power grid and now it'll be months before most of the

island has electricity again.

What's made recovery particularly hard is that the government has no money and that's

partly because of its complex relationship with the US mainland.

Puerto Rico became a US commonwealth in 1952.

But nearly half of all Americans don't know that people born in Puerto Rico are U.S. citizens.

The island's economic downturn can be traced to the 1960s and 70s, when a special tax break

from Congress led US companies to set up shop in Puerto Rico.

Then, between 1993 and 2006, Congress phased out those tax breaks, and companies left the

island in droves, taking thousands of jobs with them.

Economic growth slowed to a crawl, and hordes of people started leaving the island.

The number of Puerto Ricans living in the mainland U.S. doubled between 2005 and 2015.

As the tax base shrank, Puerto Rico went into massive debt to pay its bills.

Today, it owes more than $70 billion, mostly to Wall Street creditors.

In May of 2017, it filed for protection similar to a bankruptcy.

Budgets for hospitals, schools, and roads were slashed.

Another US policy that partly led to Puerto Rico's economic turmoil is the Merchant Marine Act

of 1920, or the Jones Act.

It places huge tariffs on any foreign imports.

And although President Trump suspended the Jones Act in the wake of hurricane Maria,

these costs have been passed on to Puerto Ricans for decades.

Since Puerto Rico imports 85% of its food, people often pay double for food and other

basic necessities.

The high cost of living is one of the reasons 43% of residents live in poverty.

Crucially, the island's financial woes have kept it from investing in the kind of modern,

automated power plant technology that's characteristic of the mainland US.

That's going to make recovery difficult.

Most power plants in the mainland rely on natural gas, with some coal, nuclear, and

renewables.

But Puerto Rico's old-fashioned plants still generate two-thirds of the power from burning

oil.

And all that oil has to be imported.

In the aftermath of hurricane Maria, the power plants are mostly intact.

But nearly 80 percent of the transmission lines that carry power are down.

And so are the roads that bring oil to the power plants.

While the island's power plants are on the southern coast, most of the people live in

the north.

And between those two ends sit dense forests and mountains.

Apart from the challenging terrain, Puerto Rico's electric utility company, PREPA,

relies heavily on workers with extensive knowledge — and those workers have been leaving.

Since 2012, 30 percent of PREPA's employees have retired or left for the mainland in search

of better jobs.

Electrical engineers from Puerto Rico can make about 27 percent more money on the mainland.

According to its director, "PREPA is on the verge of collapse because there's no

personnel to operate it."

And that was before the storm.

Without technicians to repair all the broken transmission lines, Puerto Ricans are expected

to be without power for months.

And the consequences of that can be dire.

Without electricity to pump water into homes, it's difficult to find water for drinking,

and bathing.

No air conditioning or fans can mean increased risk of heatstroke.

And no refrigerators to keep insulin from expiring.

People are at risk of dying.

All of this means millions of US citizens, for the foreseeable future, will be living

in conditions we usually associate with places very far away.

For more information >> Why Puerto Rico will be without power for months - Duration: 4:01.

-------------------------------------------

Wheels On The Bus | Nursery Rhyme | Song For Kids | Baby Rhymes by Farmees S02E248 - Duration: 1:53.

Let's hop on board the farm bus.

The wheels on the bus go round and round, round and round,

round and round.

The wheels on the bus go round and round..

All through the town.

The wipers on the bus go Swish, swish..

Swish, swish, Swish, swish,

The wipers on the bus go Swish, swish, swish

All through the town.

The horn on the bus goes Beep, beep..

Beep, beep, Beep, beep,

The horn on the bus goes Beep, beep, beep..

All through the town.

The windows on the bus go up and down, up and down, up and down

up and down, up and down..

The windows on the bus go up and down..

All through the town.

The wheels on the bus go round and round, round and round

Round and round

The wheels on the bus go round and round..

All through the town.

Come back soon for another ride.

For more information >> Wheels On The Bus | Nursery Rhyme | Song For Kids | Baby Rhymes by Farmees S02E248 - Duration: 1:53.

-------------------------------------------

Police Searching For Missing Northwest Indiana Brother, Sister - Duration: 0:26.

For more information >> Police Searching For Missing Northwest Indiana Brother, Sister - Duration: 0:26.

-------------------------------------------

The Center for Expanded Data Annotation and Retrieval - Duration: 59:47.

>> Good morning, everyone.

Thanks for joining us today for the NCI CBIIT Speaker Series.

I'm Tony Kerlavage, Director of CBIIT.

Today I will remind you that today's presentation is being recorded

and will be available on the CBIIT website at cbiit.cancer.gov.

You can find information there about future speakers.

You also can follow us on Twitter at our handle, @nci underscore ncip.

I am very happy to say welcome and introduce Dr. Mark Musen,

who is Professor of Biomedical Informatics and Biomedical Data Science

at Stanford University, Director of the Center

for Biomedical Informatics Research.

Mark conducts research related to open science, metadata for enhanced annotation

of scientific datasets, intelligent systems, reusable ontologies,

and biomedical decision support and has the long interaction with folks here

and NCI and National Academy of Medicine.

Welcome, Mark.

[ Applause ]

>> Thanks so much, Tony.

It is really great to be here and to give you an introduction to some

of the work that we're doing at Stanford and with lots of colleges

around the country, all involved in trying

to improve the way people use science.

It sounds like an ambitious goal but we really view this

as an enterprise [inaudible]

such that science is better [inaudible] having scientists create better data,

more importantly to create better metadata that will allow the rest

of the world to access [inaudible].

And so I'm going to talk to you about CEDAR,

the Center for Expanded Data Annotation and Retrieval,

one of the BD2K Centers of Excellence

which was created four years ago, and it's worked.

And I'll give you a flavor for how we believe that through some technologies

that really are not all that complicated, we're going to be able to --

I've been talking to myself.

I'm going to be able to show you how BD2Ks work and CEDAR's work,

I think is going to lead to a situation where we can rely

on scientific data more carefully.

We can be able to annotate those data with metadata

that tell people what those experiments were all about,

and we can use those data more usably.

Before I do that, though, I basically want to start with a parable.

We're having really great technical -- There we go.

I'm going to tell you a story.

I'm going to tell you a story about a colleague of mine named Purvesh Khatri

who is on the faculty at Stanford.

Purvesh is a guy who describes himself as a data parasite.

And what Purvesh does is to basically study what you would affectionately call

"other people's data."

So he really gets excited about gene expression data and has made a career

about looking at datasets from the Gene Expression Omnibus or GEO,

looking at gene expression in a variety of conditions,

being able to look at data from different subjects, collected in different ways,

to try to amalgamate those data, integrate those data,

and draw conclusions that are useful for the ongoing study of a variety

of important clinical conditions.

So his pathway, which I'm not going to go through, that would be Purvesh's talk,

but basically the whole idea here is that by looking

at publically available datasets, by taking advantage of the fact

that these publically collected data are imbued with a lot

of real-world heterogeneity and without having to actually do any experiments

of his own, without having to deal with IRBs directly, he can take these data

and he can search the datasets.

He can find [inaudible] and he can validate within those datasets evidence

of patterns in gene expression which are directly predictive of the kinds

of diseases that he cares about.

What does that mean?

It means that in a complex situation like sepsis,

which usually presents clinically as just a bunch of inflammation,

often without any indication that there's actual infection,

Purvesh has been able to go out to GEO, find relevant datasets,

integrate those datasets, and basically reduce the problem of diagnosing sepsis

to looking at changes in the activation [inaudible].

So Purvesh has an 11-gene signature that is diagnostic --

Those 11 genes get turned on before there can be clinical evidence of sepsis.

It's really a great biomarker.

It's a great set of experiments.

And it all comes from trying to analyze other people's data.

Purvesh has done this over and over again.

He has looked at the sepsis problem.

He has only recently published a great paper looking

at the ability to identify active TB.

We all know that we can do tests for TB which will stay positive for life,

whether there's active infection or not.

He has a set of genomic markers that show you when there's active TB

versus latent TB, which is incredibly important for public health.

He can distinguish between bacterial and viral respiratory infection

and presumably preventing unnecessary use of antibiotics.

He can identify through gene expression patterns

when there's incipient rejection of organs so that for transplant medicine he's

in an incredibly good situation to again have a biomarker created

by analyzing other people's data that is extremely useful

in the clinical sphere, all of this undergoing clinical studies.

Many of these experiments that Purvesh has done have led to a variety

of spinoff companies trying to market some of this stuff.

It's a really great success story.

But Purvesh has been successful for a lot of reasons.

First of all, he goes out to GEO, which is a public repository.

Everyone who does gene expression stuff

with microarrays is required to put data in GEO.

He's able to be successful because he's able to search GEO

and that actually requires a fair amount of finesse,

as I'll tell you in a moment.

He can get information out of GEO.

He can assemble it.

He can basically put it together in a way which allows experiments

and although I would argue that what's in GEO is not necessarily the product

of careful planning, organization, and stewardship.

Basically there's a desire on the part of the gene expression community to try

to do as much as they can and those data in GEO are expected

to be reused and he can do it.

The problem that we have, and the particular problem that Purvesh has, is

that the metadata that describe what's out there

in public repositories are terrible.

The scientists who are doing experiments don't necessarily view the

creation of the metadata as their job.

They view their job as to publish and get famous.

They will relegate the creation of metadata to a curator, to a postdoc,

and the metadata that are created, as we'll see in a moment,

really are very inconsistent.

Funding agencies are all jumping up and down saying we got to do this.

We all believe in open science.

We also believe in data integration.

We all believe in data reuse.

Got to put your data out there, better describe your data with metadata.

But they don't necessarily provide funding for investigators

to actually do this kind of work.

Creating the metadata itself is really hard and basically to ensure

that the metadata are standardized and searchable is just about impossible.

We spend a lot of time in the bioinformatics world thinking about data search

and mechanisms by which we can find [inaudible] in repositories.

We pay very, very little attention to the fact that the metadata out there

that we're trying to search for are actually pretty crappy.

Look at the Gene Expression Omnibus.

To create a record in GEO, what an investigator has to do basically is to fill

out an incredibly complicated spreadsheet.

And this is just a little piece of the spreadsheet.

And I don't really intend it to be in this color but the projector puts it

in that color, which I guess emphasizes how ugly it is.

And an investigator basically is given almost no guidance

in terms of how to fill this in.

You see very useful keys like sample one, sample two, sample three, sample x,

protocols like growth protocol, treatment protocol, extract protocol,

but basically there's almost nothing to tell the investigator how

to fill the spreadsheet in correctly.

There's some little help tags that pop up.

But as you can imagine, they're not very straightforward.

And most important, there's no guarantee

that this will actually lead to correct metadata.

The investigator fills out the spreadsheet, basically throws it over the wall

and then in a few days, somebody from NCBI will say whether this is okay or not.

If it's not, it's not always clear why it's not okay.

And this process iterates, you can imagine why this is something

that is not something that investigators really want to do.

And you can imagine that it's not something they can do very well.

Go to GEO.

Look at how the word "age" is represented in GEO as an attribute of samples.

Well, it's not just age, it's Age and AGE and Age and age (after birth)

and age (in years) and everything else.

And this is just a convenient sample stolen from GEO.

This is not meant to be exhaustive.

And what it emphasizes is there's no standardization here.

There's no attempt to enforce standardization.

And again, if you're a Purvesh Khatri who wants to do data integration

from existing datasets, there's no search engine that is going

to know all the possible variants of the word "age" in order

to retrieve the records that you might want if your goal is

to put all this stuff together.

So it's a matter of having a problem in inconsistency in representation,

inconsistency in interpretation of what this field "age" actually means,

and for those of us who want to retrieve these kinds

of records, we have a real mess.
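
To make the search problem concrete, here is a toy Python sketch of my own, not something from the talk: even aggressive cleanup of the attribute-name variants quoted above cannot recover what each submitter actually meant.

```python
# Toy sketch, not part of GEO or CEDAR. Normalize the "age" variants
# quoted above; "age_at_diagnosis" is an invented extra variant.
import re

variants = ["Age", "AGE", "age (after birth)", "age (in years)",
            "age_at_diagnosis"]  # last entry is hypothetical

def normalize(name):
    """Lowercase, strip parentheticals, turn underscores into spaces."""
    name = re.sub(r"\(.*?\)", "", name.lower())
    return " ".join(name.replace("_", " ").split())

print(sorted({normalize(v) for v in variants}))
# ['age', 'age at diagnosis'] -- the spellings collapse, but nothing
# tells a search engine whether a value means years, months, or days.
```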

And then, you're thinking, well,

GEO is a really old database with lots of old data.

We're doing it much better now.

And so recently we said, okay, we'll look at the newest databases at NCBI

and we said we'd look at BioSample which our colleagues there said,

this is the one you got to look at.

Well, we went to BioSample and actually [inaudible] the first exhaustive study

of what the metadata in BioSample are like.

Well, 85% of the submissions to BioSample actually do not use one

of the predefined packages that are intended to actually structure the metadata.

People just roll their own basically most of the time.

What is really astonishing to us, maybe it shouldn't be astonishing,

73% of the Boolean metadata are not true or false.

They're something else.

Twenty-six percent of the integer metadata don't seem to be integers.

Sixty-eight percent of the metadata which expressly are supposed to come

from biomedical ontologies, from our studies, don't come

from biomedical ontologies; they represent arbitrary strings.

Are we surprised?

Not really.

There is no metadata police.

There is no one to provide feedback to investigators.
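
As one illustration, written for this transcript rather than taken from the talk, here is a minimal sketch of the kind of automated type check such feedback could start from; the declared types and the example record are invented, not BioSample's actual package definitions.

```python
# Minimal sketch of an automated "metadata police" check: flag values
# that violate their declared types. Field names and declared types
# here are hypothetical, not BioSample's real packages.
def check_types(record, declared):
    problems = []
    for field, kind in declared.items():
        value = str(record.get(field, ""))
        if kind == "boolean" and value.lower() not in ("true", "false"):
            problems.append(f"{field}: {value!r} is not true/false")
        elif kind == "integer" and not value.lstrip("-").isdigit():
            problems.append(f"{field}: {value!r} is not an integer")
    return problems

declared = {"is_tumor": "boolean", "age": "integer"}
record = {"is_tumor": "Y", "age": "74 years"}
print(check_types(record, declared))
# ["is_tumor: 'Y' is not true/false", "age: '74 years' is not an integer"]
```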

But again, we're in a situation where if our goal is to be able to do the kind

of data integration that for example is written all

over the Cancer Moonshot reports, if our goal is to be able to learn

from other people's investigations,

we're really stymied because we can't even begin to find the metadata we want

when they are created in a way

which makes them fundamentally unsearchable unless you can anticipate all the

possible variants and tricks or, frankly, typos that people are going to use

when they create metadata in these repositories.

And then there's another problem.

Even if our only goal [inaudible] and to reintegrate data,

there's also the whole issue of our crisis in scientific quality

which is permeating everything that we do now

and concern that the lay public has about scientific integrity.

And if we can't have confidence in the datasets that we store online,

if we can't re-analyze those datasets to ensure that protocols as described

in the literature actually were followed,

we're in a situation where our data repositories are not giving us the kind

of defense that we need not only against concerns about scientific abuse,

but simply in our desire to be able to learn from the experiments of others

and to take advantage of existing legacy datasets.

So at minimum, what we need obviously is open science, if you will,

the ability to get open access to experimental datasets,

but we need more than that.

We need mechanisms to be able to search

for the datasets that we are interested in.

We want to make sure that those datasets are annotated with adequate metadata

and that they use controlled terms whenever possible and most important

and this is of course beyond the scope of my work or the work of any one

of us individually, collectively we need a culture in science

that actually cares about this stuff, where it doesn't sound arcane

or it doesn't sound boring because otherwise we're never going to be able

to mobilize people until they recognize that this is really an incredible part

of the scientific enterprise.

Now in the past three or four years,

there's been a lot of hype about the idea of, well,

we have to make our data findable, accessible, interoperable,

and reusable and the FAIR Principles have sort of permeated all data science,

certainly biomedicine ever since a group of investigators got together

in the Netherlands in, I guess, 2013 or so.

The FAIR Principles represent good goals.

Everybody thinks this is what we should be doing.

The real challenge that we have is despite lots of discussions in this area,

we really don't know how to measure findability, accessibility,

interoperability or even reusability, and at the same time we have a lot

of struggle in knowing how to make sure that the data

that we put online are going to be "fair."

Well, let me make some suggestions.

In order to have fair data, at the very least we need to make sure

that we use terms that are standardized, that [inaudible] ontologies,

that we have some confidence and whose semantics we believe in.

And the good thing is, at least

in biomedicine we have probably a longer tradition than anybody in being able

to think about how to structure what we deal with in biology.

I mean, this is Linnaeus whose taxonomy and speciation is still in use.

It's still one of the most important contributions that we have

in the area of biomedical ontology.

Linnaeus, by the way, offered taxonomies of disease and of plant anatomy,

neither of which we use because they were frankly terrible,

which also shows you how hard creating good ontologies actually is.

But we know how to do this.

We know how to use speciation as Linnaeus described it to do a lot

of useful things in biology.

And fast forward a few hundred years

and we have the gene ontology telling us most of what we need to know

about the structure by which we can talk about molecular biology.

We have the NCI Thesaurus, which can tell us almost everything we need to know

about cancer biology and cancer epidemiology and clinical oncology

and all that stuff that you guys know so well.

We have SNOMED which at least can enumerate everything we might want to know

about physical findings and a lot of other clinical conditions

which are well beyond a lot of other biomedical terminology.

We just have a gazillion of these.

And frankly, I will say one of the advantages that we offer at Stanford is

that we [inaudible] amalgamate all of these ontologies into a resource

that we call BioPortal, which we created under the previous NIH program,

the National Centers for Biomedical Computing.

BioPortal currently has some 600 biomedical ontologies, each of which is used

by at least a few biomedical investigators.

Some are used by many thousands.

Some are used by very few.

And I think part of our problem, of course,

in biology and medicine is that we have so many offerings

in terms of biomedical ontologies.

Some groups have said, well, what we need to do is be very draconian

and identify which are the correct offerings that people will use.

There's a long-standing problem in trying to identify which are the ones that need

to be included among the chosen.

BioPortal tries to be ecumenical and I think we succeeded in our ecumenicity.

We, unfortunately, sometimes have too many ontologies for people to know

which are the right ones but we find that this is a repository

that just gets enormous use.

We service about a million API calls a day.

We have about 60,000 people who come

and visit our website every month looking for biomedical ontologies.

And we find that having available all these ontologies is incredibly important

for a variety of tasks but we use BioPortal, as I'll say later,

extensively in CEDAR to make sure that when people create metadata,

they're creating metadata using terms from ontologies so that they're able

to describe things in a way that captures the essence of what needs

to be represented and to do so using terms that have some standard foundation.

Second requirement is not just to make sure that the terms

with which we describe metadata are standardized and understandable in some sort

of strict semantics, we want to make sure that we can describe those experiments

in a broad sense completely and consistently.

We need to be able to have standards not just for terms but for the structure

of experiments and to do so is actually not always straightforward.

We want to be able to know by looking

at the metadata what the corresponding dataset is about.

We want to know how can we search for additional data that might be relevant

to the dataset that we're focused on,

how can we group the data into appropriate categories.

We need to know where do these data come from, how are the data produced,

what were the restrictions on the experiment,

what was the protocol that was used, how are these particular data formatted,

where we can go learn more about the data, frankly, where are the publications.

And although we don't say it here, what we'd love is for these metadata

to be dynamic, to have a temporal dimension,

to tell us when somebody does another experiment

that might either confirm the dataset that was the result of the experiment

that was done, or when someone does an experiment that may cause us

to doubt the data, where the authors

of the dataset might actually retract their data.

That kind of dialogue that takes place in the interstices

of journals really needs to move into metadata so that the metadata can provide

that same kind of information but that may be another story.

So to make sense of metadata, we need to have standards for the kinds

of things we want to say about the dataset with the language that we're going

to use to represent those kinds of things and how we're going to structure this.

And as I keep saying, we need more than just a technical solution.

We need a culture that really values making experimental data accessible

to others that recognizes that this is really important for science overall.

Now, it's not that hard, actually.

Twenty years ago, the microarray community got together and said, you know,

we're required to put our data into repositories.

We want to be able to understand what one another has done

in terms of experiments.

We need to be able to come up with standard ways for talking

about what the substrate for the experiment was,

what the microarray platform is, what were the experimental conditions,

basically what we need to do, what we need to understand in order

to make sense of someone else's data.

And really without a lot of coaching,

the microarray community got together and created MIAME.

So MIAME is the Minimal Information About a Microarray Experiment model.

And MIAME is not really a model.

It's just a checklist.

It's not a template, just a list of the kinds of things that metadata

about a microarray experiment need to include.

And the most remarkable thing about MIAME is that it was created

by the investigators themselves.

And I guess the second most remarkable thing about MIAME is that it caught on

and that people recognize that if you want to publish microarray data,

you need to use at least these kinds of minimal standards

and GEO when it created its accessioning tools said, well,

we'll use the microarray, rather we'll use the MIAME descriptors

for microarray data as the basis for our data dictionary.

And so when you put data into GEO,

you're basically in a sense using MIAME as well.
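
To show why even a bare checklist is mechanically useful, here is a small sketch; the required-field list below only paraphrases the kind of items MIAME asks for and should not be read as the official specification.

```python
# Illustrative only: this checklist paraphrases the *kind* of items a
# minimal-information model such as MIAME requires; it is not the
# official MIAME field list.
REQUIRED = ["raw_data", "processed_data", "sample_annotation",
            "experimental_design", "array_platform", "protocols"]

def missing_fields(metadata):
    """Return checklist items that are absent or empty in a submission."""
    return [f for f in REQUIRED if not metadata.get(f)]

submission = {"raw_data": "GSM123_raw.CEL", "array_platform": "GPL570"}
print(missing_fields(submission))
# ['processed_data', 'sample_annotation', 'experimental_design', 'protocols']
```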

And this kind of thing has caught on in biology.

At least when we look at the kinds of experiments that are done in laboratories

when experiments are instrument-centric.

And this website, which is kind of old,

is the Minimum Information for Biological and Biomedical Investigations (MIBBI) portal

that Susanna Sansone created at Oxford a long time ago

and what you can see are a whole bunch of different kinds

of minimal information models that can describe different kinds of experiments

that we might do in biology.

And our belief is that not only are these a good starting point,

but we see the community continuing to evolve

and propose new minimal information sets and we believe these kinds

of structures provide a foundation for thinking about how we can put some sort

of requirements on metadata that can make sure that the metadata at least talk

about the things that are relevant with experiments that they need to describe.

The good news is that, as I said, stuff like MIAME is catching on like wildfire.

If you want to put stuff into GEO, you're basically using MIAME.

The bad news is that I'm a little bit optimistic when I say

that everybody wants to get into this business.

There's still a lot of resistance.

Investigators often will grouse

that minimal information standards are not really minimal at all

and there's still a problem of "what's in it for me."

It takes a lot of work sometimes to convince investigators that there is value

in making their data available to future investigators.

There's a big challenge because many investigators will say this is more work

for me.

No one is funding me to do this kind of stuff.

And more important, if I put this stuff online

and actually tell people what I did,

I'm going to get scooped before I can publish.

But under the assumption that we have an interest in the community

in the minimal information models, assuming that there's interest in continuing

to create biomedical ontologies for the terms that we need

to define good metadata, what we really need is to overcome the "what's in it

for me" problem and we need to make it easier for people to do this

so that the authoring of metadata is no longer tedious or difficult

but something, I won't claim that it's something that could be fun,

but at least something that people would find bearable.

And so really in CEDAR, if we were to describe what our lowest bar is,

our goal is just to make it tolerable for people to create metadata

that are not just metadata but good metadata that are searchable metadata.

And so in CEDAR, we're trying very hard to build a user interface and a workflow

that investigators will find helpful to them in creating metadata

that will bring additional value back to the investigator and also back

to the scientific community.

So CEDAR is our center.

This is our logo.

I come from Stanford, so we like trees and so that's why we have CEDAR.

But we have lots of other colleagues, so folks at the Oxford e-Science Centre,

primarily Susanna Sansone and her team, Steven Kleinstein and his group

at Yale University School of Medicine.

The group at Northrop Grumman responsible for ImmPort,

a data repository that has been commissioned by NIAID,

and also we do a lot of work with Stanford Libraries,

which I'll also discuss in a moment.

So all these folks are working together with us on CEDAR.

And our goal is fundamentally to create an ecosystem,

an ecosystem where we have scientists who continue to do their experiments

and generate their datasets.

And we assume in the background that those same scientists

to a certain degree are involved in these community standards groups

that are suggesting ways in which there can be

at the very least minimal information checklist that can be used

to describe different classes of experiments which ultimately can go into CEDAR

in the form of metadata templates, and I'll show you what those templates look

like in a moment, for the purposes of describing classes of metadata

that we can easily store and instantiate and search.

CEDAR users basically fill in the templates

to describe the particular specifications for their individual experiments

and then CEDAR will store the datasets and the associated metadata instances

in whatever public repositories ultimately are the recipients of the metadata.

So CEDAR provides the locus where people can bring together the community's

evolving standards for describing metadata and link those standards

to ontologies and ultimately allow the authoring of metadata that can go

out into the public repositories where we're going to be storing the data

and ultimately doing the kind of work that people

like Purvesh Khatri want to do.

So here's a cartoon that describes what CEDAR is all about.

In the leftmost panel, we talk about the authoring of metadata templates.

So we want to take as input suggestions about minimal information models

and other kinds of [inaudible] standards coming from the community

and turn those into real templates

that one can actually use to fill in the blanks.

In the middle column, we want to annotate datasets with metadata.

So we want to select the right metadata template.

We want to use that metadata template to describe what took place

in the experiment associated with the dataset that we're trying to annotate

and we're going to use all of that to basically come

up with a metadata specification that is complete and sufficiently comprehensive

that further investigators will be able to understand what took place

in the experiment that's been annotated with those data.

In the third panel, we talk about the exploration

and reuse of datasets through metadata.

So two things are happening here.

We are creating our own local cache of all the metadata instances

that are authored through CEDAR.

So in that rightmost panel, you can see the idea of having a metadata repository

that itself is searchable and the idea is we think at some point,

we're not there yet, I think we're too new in this process,

we will have acquired enough metadata and datasets [inaudible]

that we will be able to have our interesting ability to search those metadata

and look at patterns, patterns in how people do experiments,

patterns in how science evolves over time as people attempt

to do different kinds of activities.

But what we do in the short term is that the metadata that are collected

in that repository you see on the right are used to allow us

to identify patterns in existing metadata which can go back

into the middle panel and inform metadata authoring, as you'll see in a moment.

By knowing the patterns in the metadata,

we can make guesses about future entries that authors might make

as they create metadata in that second panel.

And our belief is one of the ways we can make metadata authoring more palatable

is by doing as much predictive data entry as we can

so that the burden is taken off the investigator and a lot of that burden is put

on the computer to help make suggestions about how to fill in the blanks

and you'll see that in a bit.

So basically the CEDAR workbench is providing us with the ability

to author metadata templates that reflect the community standards

that we're monitoring, to fill out those templates

to actually encode experimental metadata, gives us a repository of metadata

where we can learn the patterns in the metadata that allow us

to inform future metadata authoring,

guide their predictive entry of new metadata, and also then to ship the data

and metadata out to the repositories where they are going to be stored.

And all of this is going to BioPortal to ensure that all of this is,

if you will, ontology aware.

And this is what CEDAR looks like.

So here we have the main part of the CEDAR workbench and this is organized

in a variety of ways but this user has a series of different elements

and what we can do is we can go to any one of these particular templates

or groups of templates and begin to either fill them in

or edit them and refine them further.

For example, if I were to say I'm interested in BioSample human, I can say,

well, let's go populate, if you will,

the template that is used to specify entries

into the BioSample NCBI database from human samples.

And if I do, I can get a free sample,

take a look at a template that I've been trying to fill in.

You see that sample 56 comes from Homo sapiens.

The tissue is skin of body.

It came from a male who's 74 years old,

and the biomaterial provider was Life Technologies.

This particular subject had a disease which has the value of dermatitis.

It actually came from a cell line.

You can see information here and so on.

These are the metadata that in this case described the sample relevant

to the study that was done in this particular case.

And so we have a template where we have attributes on the left,

the corresponding values on the right, and I would argue that wherever we can,

descriptions of the attributes are coming from ontologies.

The values are coming from ontologies.

[inaudible] want to see how we actually get there.

Well, creating the template, which takes place in that left panel

of the cartoon, is one where we have a basis for the template

and we enter what are the various fields.

Do I have a sample name?

We might say the sample name is going to be an alphanumeric entry.

We might say that it's going to be an organism that's going

to be an alphanumeric entry, a tissue which is going to be an alphanumeric entry.

Here we say the tissue also has a field that can be used as a description

of that particular entry in the user interface of the metadata acquisition tool.

What we see is we've blown out the bottom here and you can see for example

if we want to describe what kind of values will be used to define tissues,

the user has selected that funny little triangle on the left saying the values

for tissue need to come from an ontology.

At this point we're not saying what ontology but we're saying

that we don't want this to be an arbitrarily typed alphanumeric entry.

We want this to be a selection of an ontology term.

And so we make that entry and then what CEDAR will do is say, okay,

let's go to BioPortal which has the most comprehensive list

of all the biomedical ontologies we know about and say,

what are the ontologies that talk about tissues and you get a list.

Well, let's see.

There's an ontology called Uberon.

Uberon is an ontology that deals with a consensus view of vertebrate anatomy.

Uberon is given as a potential source of tissue.

It has a -- And you can see what the definition in Uberon actually is.

You can see that the MA, the mouse anatomy ontology has information

about tissues.

We're probably not going to want to use the mouse anatomy ontology

for a BioSample human entry but it's there.

NIF Standard, the Neuroscience Information Framework ontology, talks about tissues

as does the [inaudible] anatomy ontology,

which we're probably not going to use either for human samples.

But right now, Uberon is looking pretty good.

And if I want to, I can use the CEDAR workbench to say, okay,

if I choose Uberon, what am I getting myself into.

What does Uberon actually say about tissues? And so you can just see

within CEDAR the branch of Uberon that talks about tissues, get a sense

for how well put together it is,

and you can decide whether the Uberon tissue branch is

where all your selections should come from for the user who wants

to find a tissue type when defining metadata.
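
BioPortal exposes this kind of term lookup through a public REST API, and a sketch of such a query appears below; the endpoint is real, but the parameter and response-field names reflect my understanding of the API, you need your own API key, and it is worth verifying against the current BioPortal documentation.

```python
# Sketch of the kind of lookup CEDAR delegates to BioPortal: search for
# a term within one ontology. Replace the placeholder with your own key;
# the response fields ("collection", "prefLabel") are my best
# understanding of the API and worth double-checking.
import requests

API_KEY = "YOUR_BIOPORTAL_API_KEY"  # placeholder

resp = requests.get(
    "https://data.bioontology.org/search",
    params={"q": "tissue", "ontologies": "UBERON"},
    headers={"Authorization": f"apikey token={API_KEY}"},
)
resp.raise_for_status()
for hit in resp.json().get("collection", [])[:5]:
    print(hit.get("prefLabel"), hit.get("@id"))
```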

Okay, so having done that when it's time to actually fill in a template,

I can say, for example, my sample name is going to be 56.

My organism is going to be Homo sapiens,

which is obviously BioSample human again.

And our tissue is going to be, well, we get a dropdown.

And what's interesting about the way CEDAR works,

you can either type in what tissue you want and see what is

in the Uberon ontology since the previous template author has said Uberon is

going to be the ontology we're using in this situation or you get a dropdown

and what you can see is not only the list of tissues that are

in the tissue branch of Uberon but you can see the frequency

with which those particular selections were made by previous metadata authors

when they were entering their metadata into CEDAR.

And so you can see that in the past, 50% of the people who are defining tissue

for the purposes of creating a BioSample human metadata specification choose

blood as the tissue.

Sometimes this can be almost as silly

as Amazon telling you what book you should order

but at the same time getting a sense

for what other authors have selected is useful and more important,

we can sort the list in the dropdown menu and can [inaudible] with the frequency

with which those selections are made in the database in the context

of the entries that have been previously made on this metadata entry form.

And so we can simplify the metadata entry process

because usually what the user wants, particularly given the previous context,

is going to be at the top of the dropdown list.

Ordinarily, selecting a tissue from a dropdown could be a pretty onerous task

because of all the selections that one might make.

And here we know it's going to be near the top most of the time

and that's actually pretty reassuring.

We do other tricks like this that are even more effective.

So for example, here's BioSample human again.

And in this case we're saying the tissue is lung

and we want to specify what is the disease that we want to specify

for this particular metadata entry.

Well, we could just give you the whole list of diseases in SNOMED.

That's probably not where we want to go.

Instead, we go to the previously entered metadata in CEDAR

and we look at the frequency with which particular diseases have been specified

in the specific context described by the user.

And it turns out that when the tissue type is lung,

then 61% of the entries in these metadata are lung cancer.

And so the entry that you most likely will want is up at the top

of the list and you can select it.

You can see lung cancer is number one.

COPD is at 31%.

Squamous cell carcinoma of the lung is down at 5% and so on.

Now you may argue that having a very generic term

like "lung cancer" probably is not all that helpful in creating metadata.

It's certainly better than having lung cancer misspelled and, in fact,

what we're doing here is not necessarily being prescriptive

in saying what the most definitive way of talking about a disease should be

but saying this at least is the most common way in which your peers have talked

about diseases in this context.

And if you want to follow what your peers have done,

this is the selection that you might make.

This is what happens if the tissue is lung.

If the tissue is brain, you get a different dropdown list.

You get the number one entry on the list is Parkinson's disease.

CNS lymphoma is number two.

Autism is at 22%.

This is not necessarily because these are the most common diseases

in the population.

These are not necessarily because these are the most common diseases

that investigators study.

It's simply because in the database of metadata that CEDAR has,

these are the most common diseases that people have used

and this is the frequency with which they occur.

And our belief, of course, is that as we get more experienced with CEDAR,

as we get more entries into the metadata repository,

we'll be able to make even better suggestions with more granularity,

as we just get more metadata and we can learn from them in more definitive ways.
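
A toy version of that context-sensitive ranking might look like the sketch below; the co-occurrence counts are invented to mirror the lung and brain examples above, and CEDAR's actual learning is certainly more involved than a frequency table.

```python
# Toy sketch of context-sensitive predictive entry: rank candidate
# values for "disease" by how often they co-occur with the tissue the
# user has already entered. Counts are invented for illustration.
from collections import Counter

# (tissue, disease) pairs harvested from previously authored metadata
prior_metadata = (
    [("lung", "lung cancer")] * 61 + [("lung", "COPD")] * 31 +
    [("lung", "squamous cell carcinoma of the lung")] * 5 +
    [("brain", "Parkinson's disease")] * 40 +
    [("brain", "CNS lymphoma")] * 30 + [("brain", "autism")] * 22
)

def suggest(tissue):
    """Order disease suggestions by frequency given the chosen tissue."""
    counts = Counter(d for t, d in prior_metadata if t == tissue)
    total = sum(counts.values())
    return [(d, round(100 * n / total)) for d, n in counts.most_common()]

print(suggest("lung"))   # lung cancer first, then COPD, ...
print(suggest("brain"))  # Parkinson's disease first, ...
```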

So the idea is in the left panel,

we can author metadata templates based on community standards.

The middle panel, we can fill in those templates with particular information

to author metadata entries.

We create our local repository and then we ship those metadata off.

And so what's very exciting for us, our colleagues at Yale have gotten a lot

of experience recently taking the metadata that they are creating

for basically sequence studies involving human immunology

and automatically taking their datasets and sending them off to BioSample

and SRA and a variety of other NCBI databases by simply creating the metadata

with CEDAR, linking them to their datasets, and pushing the button.

And we're hoping that this kind

of an approach will ultimately be a lot more palatable and a lot more enjoyable

to investigators than filling out a complex spreadsheet

and then just throwing it over the wall.

So when we talk about all this stuff in CEDAR,

it's important to remember that under the hood there's a lot

of stuff going on here.

And this is more than just a bunch of user interfaces and APIs.

Basically all the semantic elements of CEDAR, the elements of templates,

the overall templates themselves, ontologies, value sets,

these are all managed as first-class entities.

Most of those are stored in BioPortal.

We have a user interface that takes advantage of all of those semantic elements

and basically gets generated on the fly.

So we don't have any predefined menus or windows.

Basically everything you saw in those screen shots is generated programmatically

from the template that needs to be filled in,

from the ontologies that need to be used, from the previous metadata entries

from which we're generating possible menu selections.

All the software components have APIs that allow us

to have secondary programmatic access to CEDAR,

so you don't actually have to be a CEDAR workbench user.

You can build your own technology and link it to CEDAR.

So for example, we have a collaboration with Elsevier

who have linked Mendeley Data to CEDAR.

And although they want to use the Mendeley Data user interface,

they can get access to CEDAR directly through our API in a very effective way.

Actually it was very exciting for us,

the Elsevier folks were able to build their connection

between Mendeley Data and CEDAR in an afternoon, which really stunned us.

And everything in CEDAR is output in JSON-LD,

which is an industry standard representation for semantic information.

There are lots of choices here.

We thought that JSON-LD probably has the greatest uptake at this point

in the commercial community.

So that's why we made that decision.
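
For a feel of what that looks like, here is a minimal JSON-LD-style metadata instance assembled in Python; @context, @type, and @id are genuine JSON-LD keywords, while the example.org URIs are invented for illustration and are not CEDAR's actual template model.

```python
# Minimal JSON-LD-flavored metadata instance. The UBERON IRI is a real
# ontology term; the example.org identifiers are hypothetical.
import json

instance = {
    "@context": {
        "tissue": "http://purl.obolibrary.org/obo/UBERON_0000479",
        "disease": "http://example.org/schema/disease",    # hypothetical
    },
    "@type": "http://example.org/templates/BioSampleHuman",  # hypothetical
    "@id": "http://example.org/metadata/sample-56",          # hypothetical
    "tissue": "skin of body",
    "disease": "dermatitis",
}
print(json.dumps(instance, indent=2))
```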

I just went in the wrong direction.

Okay, so what we're doing in terms of dissemination is to try to get

as much experience as we can in having collaborators help us use our technology,

help us evaluate this technology

and obviously also help us proselytize it, we hope.

So we have a limited amount of [inaudible] from our BD2K grant

and we are working with a number of trusted colleagues who are beginning

to put CEDAR to use in their particular environments,

and I'll talk about those in a moment.

This gives us the ability to look at CEDAR in the context of different workflows

and with different requirements on the metadata themselves.

Certainly, we're interested in pursuing new strategically important

collaborations, for which we would obviously need additional funding,

but that's clearly on our radar screen.

And on the recommendation of our scientific advisory board,

we're about to start doing something we've never quite done before.

We're not really going to do hackathons but we're going to have sessions

that are called "bring your own data" and these BYOD sessions,

which may not be as much fun as BYOB sessions,

but in any event will give us the opportunity to get people in the same room

and see what we can learn from their attempts to use CEDAR

and hopefully give them the opportunity to try out their datasets

in our particular framework.

I'm going to talk a little bit about some

of the ongoing collaborations we have right now.

From the very beginning we saw ourselves as wanting to plug ourselves

into an ecosystem with ImmPort.

So ImmPort is a data repository that's been under development for more

than a decade by the Division of Allergy, Immunology,

and Transplantation (DAIT) at NIAID.

The requirement nominally is that all DAIT-funded studies need

to have their data put into ImmPort.

And I don't think that's actually being enforced at this point.

But ImmPort represents a repository with a single kind of metadata descriptor

for all the studies that are being put there.

We are working very closely with Northrop Grumman

who had the development contract for ImmPort, and our goal is that

by the January timeframe they will be working with us to actually use CEDAR

as the basis for authoring the metadata descriptors

for the data going into ImmPort right now.

ImmPort has been really interesting for us

because obviously we have primary funding through NIAID,

so we want to keep NIAID happy.

It turns out there are many complexities in their workflow

and frankly there aren't that many datasets going into ImmPort.

So there are some pluses and minuses in that setting, which has caused us to want

to identify how we can use CEDAR in other kinds of investigative opportunities

and that has led us to a collaboration with LINCS

which complements the ImmPort collaboration nicely.

So LINCS is a large consortium of NIH-funded investigators.

They're funded by the Common Fund.

They have five large investigative units around the country

and they currently have a data coordination and integration center

with whom we collaborate at the University of Miami.

Frankly, I've not found out this week how our folks in Miami are doing.

I imagine they have complications.

But the DCIC is working with us

on a system whereby the various groups affiliated

with LINCS will be submitting all of their data,

which is a variety of data related to cell signaling studies,

into a local repository that is being maintained at the University of Miami.

And so unlike our association with ImmPort, which is their own legacy repository

which has been around for ten years,

LINCS is in the process of building their whole data repository right now

with their own metadata component.

We will be providing the mechanism whereby the LINCS collaborators will author

the metadata that will be annotating the datasets that go into that repository.

The goal was to have that begin field testing in the next month or two.

I'm not sure actually what's going to happen after Hurricane Irma.

We mentioned that Yale is a good collaborator with us.

Steve Kleinstein at Yale leads a group of investigators

in the Adaptive Immune Receptor Repertoire Community basically looking

at sequence analysis of T cells and B cells and looking at ways

in which NGS technology can help us better understand the immune response.

What's really exciting is that by January,

the AIRR community is going to be working with us to get all of their data

up into NCBI databases and that's why Yale is working so hard

and making these extensions to CEDAR to facilitate the translation

of CEDAR metadata into SRA and BioSample and BioProject which is just very,

very exciting from our perspective.

And I'm going to mention this because it's kind of surprising.

But we've got a lot of excitement from our library at Stanford which is involved

in a consortium called LD4P.

It's a very exciting consortium, not a very exciting website.

But LD4P is Linked Data for Production and they're involved

in basically the enormous transition that's taking place

in the library community as MARC,

the machine-readable cataloging format for bibliographic metadata that came

out in the 1960s, is being jettisoned for something called BIBFRAME

which is a linked open data kind of representation that is being promoted

by the Library of Congress, and suddenly the whole acquisition

of digital objects for research libraries is being turned upside down

and LD4P is working with CEDAR, testing out the CEDAR Workbench at Stanford,

at Columbia, at Princeton, and at Cornell as a mechanism by which [inaudible]

of materials for research libraries which is not exactly the kind

of scientific metadata creation that we had in mind initially

but I think it's a good use case for us and something which we find pretty exciting.

Which brings us back to the whole problem

of how we make all these metadata FAIR,

how do we ensure that we can actually get the buy-in from the people

across the broad scope of science who are going to be able

to help us change the scientific enterprise,

to make people like Purvesh Khatri find that they can actually search for data

and find them by having metadata that actually meets the kinds of criteria

that are embedded in the CEDAR project.

Well, if we can make it easy and palatable to use community-based standards

and community-based ontologies to author metadata that are complete

and comprehensive, then I think we're going

to see major changes in the way we do science.

I think we're going to be able to see increasingly the reliance on ontologies

to enable critical representation of scientific knowledge

and ultimately we'll be able to formalize a lot of what we know in science

in terms of ontologies, very much in the way the NCI Thesaurus has played

such an important role here.

We're going to be able to see the community-based standards

that people are developing work their way into formal templates

and ultimately see those templates become the way

in which we standardize the communication of scientific knowledge

and I think what is very, very exciting to me,

even more exciting to people like Purvesh,

is that if this whole enterprise is successful,

then ultimately we're going to see the dissemination

of scientific results not just as prose journal articles but as the kinds

of datasets and associated metadata that will allow us to know

with great precision what people are doing in their investigations

and not only to be able to know what those investigations were

and what the conclusions of the investigators were

but to actually see the results and to verify the results

and to integrate those results with other experiments.

And I think, as someone who has a long background in AI,

the ability to have intelligent agents that are going to be able to look

at these online datasets and these online metadata collections

to me is just incredibly exciting.

It means that someday we'll close our eyes

and there will be a Siri-like agent that's going to tell us whether there are new studies

that we should be looking at, how we can compare and contrast different studies,

maybe compare our results with someone else's results along dimensions

that might matter, how we can identify the characteristics of a study population

that would allow a clinician to know

which clinical trial has the subjects that best match the subjects that he

or she wants to treat right there,

to be able to identify what are the most relevant clinical trials on the basis

of data that come out of these online repositories rather

than from the traditional Table 2 of clinical trials, which may

or may not be nearly as comprehensive and ultimately for the intelligent agents

to be able to tell investigators when someone else has done a study just

like that one, or to be able to make sure that when there are follow-on studies,

those are identified and pulled immediately for evaluation.

I think what's very exciting is that we're moving in a direction where maybe

in our lifetime, technologies such as CEDAR will make the primary publication

of scientific knowledge the data and the metadata, not the prose,

and then ultimately not just people working by themselves but people working

in concert with intelligent agents are going to be able

to have those agents read and return the literature to them

when important issues come up.

Those agents will integrate information.

They'll be able to track scientific advances.

They'll be able to help re-explore existing datasets and do data integration.

And I think what will ultimately happen is that we'll have computers

that can suggest what is the next line of experiments to do

and if you believe some of the work in robotics going on now,

maybe those intelligent agents will actually do the experiments for us.

But in any event, before we get there, we have a long row to hoe.

We definitely have to move forward in ways to make our metadata better.

We have to make our scientific results more accessible.

And that's basically our goal in working on the CEDAR workbench

to bring all this stuff together and to make it so that

when we put our datasets online, we actually can come back to them weeks

or years later and make some sense out of what we've done.

Thanks.

[ Applause ]

>> Thanks, Mark, for that presentation.

If you're here in the room and have a question,

please use one of the mikes on the left side of the room.

If you're on WebEx, please use the raise hand feature in the dashboard

and we'll unmute your line.

We'll start with Mark.

>> Thanks.

So this is amazing and wonderful.

Can I play with it?

>> Yes.

>> Can I create some templates and put them in

and then add some data and see what comes out?

>> Today.

>> Today we could do it?

>> So the general CEDAR website is not --

Oh, there it is, metadatacenter.org and if you go to cedar.metadatacenter.,

I don't remember what extension we used,

it's a link from the main site.

You'll be able to go directly to CEDAR.

You can make yourself an account.

You can create your favorite metadata template

and see how well it works for you.

I won't say operators are standing by but if you need help, give us a call.

>> Great presentation.

Just curious in terms of have you approached the GEO folks to see

if they're interested in using CEDAR?

>> We haven't talked to the GEO folks directly.

We've talked to a bunch of folks at NCBI.

Ben Busby is probably our primary contact there.

And of course he's telling us to focus on BioSample and BioProject

and all the new databases because GEO is so old that it's hopeless.

I think ultimately once we have CEDAR in a more robust state,

and once we have more experience in particular with GEO datasets,

we'll be able to provide guidance for people who want to use CEDAR

to author GEO metadata and offer a better result there.

The problem with GEO is that it's so old,

it has enormous amounts of legacy stuff

and then I won't say microarrays are falling out of favor,

but there are other technologies now and so we're being pushed more

to the newer data repositories.

>> Thanks, Mark.

So you've focused on talking about providing information about datasets.

That's the term you've been using a lot and I won't get into,

we shouldn't get into semantic discussions about what we mean by dataset

but we can define a dataset at one level: if I'm submitting data to GEO

about a study and say this is a study about such and such disease

and you provided examples where you would be providing a list of diseases.

But the usability of data in a dataset is often going to depend

on whether the right terms have been used for gender

of each individual patient that's in there.

So in order to manage that, one's got to have a form

where you're submitting data about each individual patient

from which you might have a sequence.

Personally, I would regard the gender as data rather than metadata

but somehow we've come to talk about all those things as metadata.

But in what you're talking about here,

which level do you see this being applied at: more at the level of a study

or up at the level where we're collecting individual data

about individual subjects, let's say, be they mice,

humans, samples, whichever?

>> Well, to paraphrase the composer, anything you can do, I can do meta.

And there's obviously ambiguity as to what is going

to be the correct level at which to be working.

One of the reasons I was not crisp in defining dataset in my talk is

that the metadata in CEDAR allow you to create those kinds of annotations

at different levels of abstraction and you can associate metadata

or descriptors with data in certain ways.

Now the issue that you're talking about comes up when you may want

to describe a biological sample as coming from some organism with sex male

to which a lot of data may apply versus, for example,

a population study where you want to be able to have at the level

of the datum a description of an individual subject whose gender or sex matters.

One of the things that we can do in CEDAR is to facilitate those descriptions

at different levels of abstraction and we recognize that often one wants

to be able to take data and group them so there may be metadata

for a particular sample, then metadata that define a collection of samples

in a study, metadata for a collection of studies

that might be relevant to an investigation.

We want to make sure that we can handle that kind of situation as well.
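
A toy illustration of that nesting, with invented field names: descriptors can live at the level of an individual sample, a study that groups samples, or an investigation that groups studies.

    # Invented structure to illustrate metadata at several levels of abstraction.
    investigation = {
        "title": "Immune response investigation",
        "studies": [
            {
                "title": "Cohort A vaccination study",
                "samples": [
                    {"sample_id": "S1", "sex": "male", "tissue": "blood"},
                    {"sample_id": "S2", "sex": "female", "tissue": "blood"},
                ],
            },
        ],
    }

    # Queries can be answered at whichever level is relevant.
    for study in investigation["studies"]:
        for sample in study["samples"]:
            print(study["title"], sample["sample_id"], sample["sex"])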

>> I get the ambiguity and that there's a spectrum;

what I was really looking for is, do you think

there's any executive decision about where along

that spectrum one is, how much it is reasonable to bite off

and where we should stick the stake in the ground,

do you have an opinion about that?

>> I guess the answer is computationally it doesn't make that much difference

because I think from the way in which CEDAR operates,

the way one creates templates and assigns templates to datasets,

we can do that recursively, without having to really worry

about what level we're operating at.

I think scientifically it's a much different question in terms

of what is a unit of [inaudible] unit of study

and those are different questions.

>> I see the biggest challenge in that as who does it and when,

because you've described this as a pain to do, right,

so the question is, how much do we want this being done day to day in labs

or is it done once at the end of the study.

>> And the answer to that, I think, is it depends.

Like everything else that we're dealing with,

it depends on where things fit into workflow.

Ideally, when we're dealing with instrument-based experiments or many

other kinds of laboratory-based studies, we want those metadata to be created

where the data are created at the bench.

In the best of all possible worlds, we wouldn't have a formal process

as in CEDAR where one has to fill in those blanks in the template,

we would have CEDAR embedded within electronic laboratory notebooks so that

when the data are managed by the investigator or by the postdoc,

basically those annotations are created as just part of doing the work

and that they get automatically sucked into CEDAR

so the appropriate metadata can be put

in the ultimate representations that go online.

Basically our longer term goal is to remove the workbench as sort

of this intervening step and to be able to tie this kind

of metadata authoring directly into what happens at the bench

but that's a long way down the road.

>> I'm afraid we're at the top of the hour.

I do want to alert you to our next presentation, which will be on Wednesday,

October 11th, and Anant Madabhushi

from Case Western Reserve University will be presenting at the Speaker Series.

I want to thank everybody who's joined today and let's thank Dr. Musen once more

for a terrific presentation.

[ Applause ]

For more infomation >> The Center for Expanded Data Annotation and Retrieval - Duration: 59:47.

-------------------------------------------

Visa-Free Travel for Visiting Filipinos - Duration: 0:53.

For more infomation >> Visa-Free Travel for Visiting Filipinos - Duration: 0:53.

-------------------------------------------

Losing your motivation for IELTS. Watch this! - Duration: 11:38.

For more infomation >> Losing your motivation for IELTS. Watch this! - Duration: 11:38.

-------------------------------------------

Could it be all change at the top for Newcastle 26 years on from miserable milestone? - Duration: 5:56.

Could it be all change at the top for Newcastle 26 years on from miserable milestone?

It will go unrecognised by the vast majority of fans. Even the club's own historian may not be aware of its significance.

But October 5 1991 was a special day for me. I became a father for the first time when my wife Sue gave birth to James, the first of our three sons.

It also marked the day Newcastle slumped to the lowest position in their long history. They lost 3-1 against Portsmouth at Fratton Park, a result which sent them crashing to the bottom of the old Second Division.

This was the season before the launch of the Premier League. The old Division One had 22 teams and Division Two had 24. So at 5pm that Saturday afternoon, Newcastle were 46th in the Football League. They have never been lower.

This was a time of great change at St James' Park with Sir John Hall and his Magpie Group in the throes of taking control from the old board.

By the end of that 1991-92 season, Ossie Ardiles had been sacked, Kevin Keegan was manager and Newcastle avoided relegation on the last day of the campaign. Within 12 months, Newcastle had won promotion to the top flight.

Two years later, they were lauded as The Entertainers.

Before the 20th century was out, they'd finished runners-up twice, reached two FA Cup finals, smashed the world transfer record to bring Alan Shearer back home and transformed their home into one of Europe's most impressive footballing citadels.

Freddy Shepherd, who died last week, was one of the key architects behind the club's remarkable renaissance. I kept in touch with Freddy long after he'd reluctantly vacated his role as Toon chairman following Mike Ashley's buy-out.

And so there was a lump in my throat during a rousing minute's applause before Sunday's 1-1 draw against Liverpool, just as there was during an immaculately observed minute's silence at Kingston Park on Friday night when Newcastle Falcons - another successful Hall/Shepherd project - went to the top of the Aviva Premiership.

Many of those fans who clapped and even sang Freddy's name may have been the same ones who celebrated when he left the club. But time has revised opinions.

The Toon Army now accept that 1992 to 2007 was a rich period for their club, albeit an unfulfilled period.

Hall and Shepherd dreamed and delivered - on the pitch and off it. The magnificently renovated St James' Park stands as Shepherd's legacy.

Two relegations in the decade since his departure act as a warning of what can happen when austere balance sheets and politics eclipse passion and ambition.

How ironic that in the week of his untimely death, it looks as if the unloved Ashley era could be coming to an end. The presence of entrepreneur Amanda Staveley at the Liverpool game does not indicate a fresh takeover is imminent.

But it is the latest and most visible example of forces at work behind the scenes which could lead to new ownership. Staveley, 44, runs the £24bn private equity fund PCP Capital Partners, an investment vehicle for mega-rich investors.

And while talks have yet to take place between Ashley and her, it is understood she is looking to move into football. Freddy would love that.

He bit his tongue on many occasions when asked to comment on Ashley's running of his beloved Newcastle, letting rip only once, when SJP was suddenly and outrageously turned into the Sports Direct Arena.

But he craved a return to the dash, dare and deliverance of the Hall/Shepherd years. Anyone with the vision, ambition and wealth to fulfil Newcastle's potential would be welcomed with open arms by him.

Freddy Shepherd gave it a go. Years on from the Magpie Group rolling the dice, are Newcastle on the verge of another exciting new era?

For more infomation >> Could it be all change at the top for Newcastle 26 years on from miserable milestone? - Duration: 5:56.

-------------------------------------------

Finland - prepared for crisis - Duration: 3:44.

This is Helsinki, Finland's capital.

A city of culture, statues and seagulls.

But beneath the surface, Finland has one of the most comprehensive disaster response systems

of any country in the world, and is sharing its expertise with international partners,

including NATO.

It is a country prepared for any crisis.

We are prepared for natural disasters, floods, forest fires.

And man-made disasters, terrorist attacks, even the worst-case scenario of war.

And Finland's attitude towards crisis response is born out of experience.

War planes darkening the sky of Finland...

We lost 90,000 people during the Second World War.

That was a huge loss in this small country.

So everybody understands why it is important to be prepared in every possible crisis.

After Helsinki was bombed repeatedly during the Second World War, the authorities in Finland

put measures in place to make sure that if it happened again, the nation would be ready.

Through these doors, down this elevator, is what looks like a regular children's play

area, gym and sports centre.

However, this is also a custom-built air raid shelter designed to house 6,000 people for

up to a week in a crisis situation.

Overseeing it is this man, Jari Markkanen.

This cave is 230 metres long.

Blast-proof doors.

Plenty of entrances.

This kind of toilet facility.

Beds for over 600 people.

In Finland, every building with a floor area of more than 1,200 square metres is required

to have a shelter to protect citizens.

But underground shelters are just one of the civil defence measures.

What we're doing here is an exercise involving police, fire department and emergency medical

services in order to maintain and build up the maximum effectiveness for real-time emergencies.

At the end of the day it always comes to rescuing people manually.

We need to know our environment to be able to function effectively.

We are playing with time.

Finland's comprehensive approach to civil security also involves training key decision-makers

and business owners in what to do if a crisis situation occurs.

These resilience measures are a priority for Finland, a NATO partner, and NATO Allies alike

and they have long worked together to prepare for crises and disasters.

The opening, in Helsinki, of a European Centre for Countering Hybrid Threats will further

strengthen cooperation.

NATO has great standards for civil preparedness and we are following those and comparing with

ours and sharing information in both ways to improve resilience capabilities.

While underground shelters and crisis response measures are a last resort, in times of increasing

uncertainty where cities across Europe are on heightened alert, if a crisis does strike,

Finland will be ready.

For more infomation >> Finland - prepared for crisis - Duration: 3:44.

-------------------------------------------

UW researchers are improving treatment for people with kidney disease - Duration: 0:44.

Dosing patients with kidney disease is complicated. But the problem with kidney

disease and liver disease is that those diseases are dynamic, which means in many

patients they continue to get worse over time. So we can't pick a dose for a

patient on one day and expect that dose to be the same a year or two years

or 10 years from today. Kidney-on-a-chip, I hope, will replace a lot of the current

testing methods that we use in drug development for both trying to

understand if drugs can treat kidney disease and understanding whether drugs

can harm the kidney. As a pharmacist my goal is always to make the life of my

patients better. And the more we understand kidney disease,

the better their lives will be.
