Samvera#General talking about Fedora

Tuesday, August 29, 2017

Mike Giarlo (5:56 PM)

Have folks here been hearing all manner of rumors today about Samvera, or certain Samvera institutions, walking away from Fedora and other community components? Some of us are hearing these rumors as of a few hours ago, and we’re trying to figure out where the misinformation is coming from.

It seems to center on Valkyrie. We did discuss Valkyrie and Fedora futures on today’s Fedora Leadership group, but not in the context the rumors are in.

Jonathan Rochkind (5:58 PM)

i have not. I have been hearing about that for months vaguely, but nothing special new.

seems like the Valkyrie stuff could lead to it though? Based on slack discussion over the past few weeks, I got the impression that there were several institutions hoping Valkyrie would take them away from fedora.

Mike Giarlo (6:00 PM)

Could, to the extent that Valkyrie allows for greater flexibility w/r/t persistence.

Like, I doubt the community would invest as much in Valkyrie if we wanted to ditch Fedora. We’d just… ditch Fedora, instead of putting cycles into allowing folks to use Fedora via a different adapter.

This was a fairly specific rumor, so I doubt it’s the continued excitement around Valkyrie.

Trey Pendragon (6:02 PM)

I haven’t heard a rumor.

There’s been rumblings about misinformation about Valkyrie in the data mapper WG, so we’re generating an FAQ.

It’s in progress at the moment: https://docs.google.com/document/d/1JITfW6FlaHS1pM5mVwxx5Eeac_4xXBxhooxlvXPhe7I/edit, but when it’s done we intend to post it as part of the project to hopefully try and clear up some of the questions.

See #5, #8, #9, #10

Jonathan Rochkind (6:04 PM)

I would like it if the community ditched fedora, but that’s just me, and it’s not a rumor, haha.

Mike Giarlo (7:04 PM)

To dispel some of the rumors I’m hearing and set the record straight:

  • Stanford does not have any plans to move away from Fedora; we continue to actively invest in Fedora-based solutions.

  • Valkyrie does not represent a fork in the community. Once Valkyrie is ready, we will come together to support it in Hyrax, and it will continue supporting Fedora.

esmé cowles (7:37 PM)

Princeton is also going to continue supporting Fedora-based solutions, and I’m actively working on improving the Fedora API specification, and I’m signed up to work on the API alignment sprint in a couple weeks

Trey Pendragon (7:50 PM)

@mjgiarlo Feel free to update that Hyrax question in the FAQ.

It mostly says what it says now because none of us there knew what the plans were.

Mike Giarlo (8:41 PM)

@tpendragon Thanks!

Wednesday, August 30, 2017

hackmaster.a (9:37 AM)

I don’t know that there is a united front on this in the community right now. The fact that we’re working to continue to support Fedora could mean that we all continue to be super committed to Fedora, or it could mean that some of us continue to be super-committed to Fedora, and others don’t, but still value the community.

Like, I doubt the community would invest as much in Valkyrie if we wanted to ditch Fedora. We’d just… ditch Fedora, instead of putting cycles into allowing folks to use Fedora via a different adapter.

— Mike Giarlo

+ Not sure that has much bearing on the conversation, but I feel it’s important to acknowledge that the community comprises a diversity of opinions and say that that’s okay.

esmé cowles (9:40 AM)

@hackmaster.a i think that’s a big part of this — i think it’s good to have a community that welcomes different technologies, use cases, etc.

and there have been non-Fedora folks around for a while (e.g., UCSD and DPLA)

Jonathan Rochkind (9:41 AM)

I think that’s a really good point.

esmé cowles (9:41 AM)

but i think Valkyrie widens that quite a bit, so i’m not surprised it’s causing some confusion

Jonathan Rochkind (9:42 AM)

We do not all need to do the same thing, and still can be part of the same community!  To be sure, there are potential theoretical advantages from us doing the same thing, and being able to share more code. But there are also real disadvantages, and advantages to a diverse ecosystem.

We are a community because of how we relate to each other, not because we’re all using the same code.

hackmaster.a (9:43 AM)

searches for :puppies_and_kittens_group_hug:

Jonathan Rochkind (9:44 AM)

(we already aren’t btw. I really got to polish off my side project that gives reports on what versions of hydra/samvera are in use, analyzing open Gemfile.locks from community institutions. But it’s very diverse, nearly every major version of every community-maintained gem is still in use by someone, and for many of them the majority are not on latest version and the distribution is pretty flat across historical versions still in use)
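
As an illustration of the kind of report described above, here is a minimal sketch in Python. It is not the side project mentioned in the message; the function names, the sample gem names passed in, and the plain-text layout are all assumptions made for the example. It only reads the top-level `specs:` entries of each Gemfile.lock and sorts versions as plain strings, which is good enough for a rough tally but not for strict version ordering.

```python
import re
from collections import Counter, defaultdict

# Matches top-level spec lines in a Gemfile.lock, e.g. "    hyrax (2.0.0)".
# Dependency lines are indented six spaces, so this pattern skips them.
SPEC_LINE = re.compile(r"^ {4}([A-Za-z0-9_.\-]+) \(([^)]+)\)$")


def locked_versions(lockfile_text, gems):
    """Return {gem: version} for the gems of interest found in one lockfile."""
    found = {}
    for line in lockfile_text.splitlines():
        match = SPEC_LINE.match(line)
        if match and match.group(1) in gems:
            found[match.group(1)] = match.group(2)
    return found


def tally(lockfiles, gems):
    """Count how many lockfiles pin each version of each gem."""
    counts = defaultdict(Counter)
    for text in lockfiles:
        for gem, version in locked_versions(text, gems).items():
            counts[gem][version] += 1
    return counts


def ascii_report(counts):
    """Render the tally as plain text, one bar of '#' per version."""
    lines = []
    for gem in sorted(counts):
        lines.append(gem)
        # Versions are sorted as strings here, not as semantic versions.
        for version, n in sorted(counts[gem].items()):
            lines.append(f"  {version:<10} {'#' * n} ({n})")
    return "\n".join(lines)
```

Calling `ascii_report(tally(texts, {"hyrax", "sufia"}))` on a list of lockfile contents would give one block per gem with one line per version in use.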

Jonathan Rochkind (9:44 AM)

maybe I’ll do an ugly ascii-text report on my blog, just to capture the info. I want to turn it into a nice auto-updating html thing on the web.

On the other hand, the move from "hydra heads" to sufia/hyrax standardization shows that some people realized they were doing things that weren’t sustainable because of lack of consistency.  There is no magic bullet, and no way around individual organizations making engineering decisions themselves, instead of just trying to "do what everyone else is doing".  hydra heads were of course "what everyone else was doing" at one point too. (edited)

esmé cowles (9:49 AM)

yep, i think we’ve been wrestling with these issues as long as i’ve been part of the samvera community (~5 years):

  • let’s share building blocks but all do our own thing

  • too much reinventing the wheel, let’s extract good stuff and share more

  • maybe we can all use sufia?

  • sufia doesn’t do everything i want, let’s add all our use cases to it

etc.

(this is all from my perspective, but i definitely think we’ve been trying to balance heterogenous use cases with sharing as much code as possible, with various approaches and varying amounts of success)

Jonathan Rochkind (9:57 AM)

yup, agreed. We keep going back and forth on the pendulum.

I think there are significant institutional pressures in many of our libraries to "just do what everyone else is doing" (not just on samvera but generally, but maybe especially on technical decisions).  Sometimes it makes sense to do what everyone else is doing, but it’s not safe to just do that without considering what makes sense for your business and user needs.

but most libraries still don’t want to accept that they are IT organizations that need capacity for making IT  decisions. Thus "just do what everyone else is doing" (edited)

Michael Klein (11:55 AM)

This might be a pretty naive question coming from someone who’s been around this community as long as I have, but other people keep asking me, and I’m having a hard time coming up with a concise answer:

  • What benefits does Fedora (either as an API or as the fcrepo v4.7.4 reference implementation of that API) currently provide over any other hybrid RDF/binary datastore?

  • I mean, I understand that it can be configured out the wazoo with messaging and alternative storage backends and all that, but in terms of all the things we talk about when we talk about preservation (fixity, format migration, packaging, access control, etc.), what do we get, and what did we gain/lose in the move from 3.x → 4.x?

Jonathan Rochkind (11:56 AM)

@mbklein it is not a naive question. I personally don’t think it provides any. It is not even an RDF store, really.  I don’t think others agree with me, but I have also seen no particular case for what it provides.

Jonathan Rochkind (11:57 AM)

@mbklein I think that is actually the secret dark side, I don’t think fedora is providing much of anything for us, at least the way we are using it. Except pain.

sometimes people mention the "fixity checking". I.e., taking a checksum and storing it somewhere. I.e., something that could be implemented in any system at all in a couple hours.
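
For readers unfamiliar with the term, the "take a checksum and store it somewhere" idea can be sketched as follows. This is only an illustration in Python of recording a digest and re-checking it later. It says nothing about how Fedora implements fixity, and the JSON manifest file and function names are assumptions invented for the example.

```python
import hashlib
import json
import os
import tempfile


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large binaries need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_fixity(path, manifest_path):
    """Store the file's current digest in a JSON manifest (hypothetical format)."""
    manifest = {}
    if os.path.exists(manifest_path):
        with open(manifest_path) as handle:
            manifest = json.load(handle)
    manifest[path] = sha256_of(path)
    with open(manifest_path, "w") as handle:
        json.dump(manifest, handle, indent=2)
    return manifest[path]


def check_fixity(path, manifest_path):
    """Return True if the file still matches the digest recorded earlier."""
    with open(manifest_path) as handle:
        expected = json.load(handle)[path]
    return sha256_of(path) == expected


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        sample = os.path.join(workdir, "image.tif")
        manifest = os.path.join(workdir, "fixity.json")
        with open(sample, "wb") as handle:
            handle.write(b"original bytes")
        record_fixity(sample, manifest)
        print(check_fixity(sample, manifest))
```

The sketch covers only the on-demand check. Verifying a digest supplied by the client at upload time (transmission fixity) and scheduling, alerting, and repair are separate concerns it does not address.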

Michael Klein (11:58 AM)

Speaking ONLY for myself: I am concerned that we (NU) are using Fedora because OF COURSE we are using Fedora because we are part of the community and dedicated to using/improving/supporting the products the community creates.

#TautologyDrivenDesign

esmé cowles (11:59 AM)

@mbklein some of the things i’d say it provides:

  • CRUD and auth following W3C specs (LDP and WebAC)

  • versioning following Memento

  • fixity checking (transmission and on-demand)

esmé cowles (12:02 PM)

@mbklein there’s also various tooling:

  • import/export tool

  • camel messaging

  • API-X

    samvera shops don’t typically use these last two, but we could

Jonathan Rochkind (12:02 PM)

@mbklein when I suggested in a blog post that was the reason people were using fedora, @barmintor got really really mad at me. But that remains my impression, personally. It is not everyone’s.

I am still interested in hearing a clear case for both fedora and RDF from people who have one. I still haven’t found one myself. I understand not everyone has time to write such things, but I’m still not grasping it myself.

@escowles there’s an import/export tool? That works?  I should look into that. We still haven’t figured out how to copy things from one fedora to another, say from prod to staging.  This would potentially help us do that?

Michael Klein (12:06 PM)

My impression probably comes from the fact that:

  1. We don’t currently have any non-Samvera Fedora use cases

  2. We don’t use WebAC or versioning (right now); our Samvera apps mediate access and connect with full rights

  3. We don’t use any of the messaging features

esmé cowles (12:06 PM)

@jrochkind there is (https://github.com/fcrepo4-labs/fcrepo-import-export) — there’s one big feature missing (importing versions), but there’s a PR for that

Jonathan Rochkind (12:06 PM)

it works except for that? Cool, I’ll check it out. I hadn’t found that before, perhaps because it’s in "labs".

@mbklein additionally, the fact that as far as I can tell many institutions doing samvera end up doing one-fedora-per-samvera-app, when I would have expected one institutionally-wide fcrepo to fill "institutional repository" needs. If it’s really just a single-app-specific-data-store… maybe some people have business cases that fcrepo fills, but I don’t think I do.

esmé cowles (12:09 PM)

@mbklein yep, most samvera apps are using ruby/rails alternatives to the fedora auth, messaging, etc.

Michael Klein (12:10 PM)

@jrochkind We at least are using a shared Fedora. 🙂

But it’s still segmented by base path, so there’s not a sense that all our "stuff" just lives happily in one big Fedora "community" to be accessed by whatever apps can make use of them.

Jonathan Rochkind (12:11 PM)

I’d guess you have pretty different data models for your different apps. I think that is the big magic bullet misperception of RDF/linked data, that if it’s just "linked data" then it’s inter-operable. (edited)

Trey Pendragon (12:16 PM)

So, my opinion is basically this:

  1. Modeshape FCRepo doesn’t perform well enough to do what we want. This is known, it’s being worked on.

  2. Fedora is a GOOD idea. It doesn’t just do fixity. It doesn’t just do messaging. It’s middleware API for what your repository is probably doing.

  3. Having an airgap to associate binaries with metadata turns out to be pretty important. I know from experience that it’s super easy for that link to be separated without an API enforcing it, resulting in metadata without full resolution images. It’s not a preservation solution, but it’s an interface to count on. I’m super concerned about it with Valkyrie, but I think having the flexibility is important too.

  4. If you think of Fedora as a service for a repository, a middleware layer, and not as a postgres replacement, it makes a lot of sense. ESPECIALLY if you don’t have, or don’t want, to do everything in one huge front-end. The moment we at Princeton had to synchronize our staff management system with a front-end (spotlight), we ended up running into most of the same problems. How do we handle authorization, even if they’re both authenticated the same way? How do we tell one system that the data has changed? How do we ensure the data we’re pulling down is the data we asked for? All these questions are answered by the middleware layer.

  5. I think RDF is in the way. It’s a performance issue, it’s a training issue, and I think it’s a scoping issue. We would probably be much farther along if we could pass JSON around and then handle RDF as a discussion of preservation and distribution with our metadata experts.

Jonathan Rochkind (12:17 PM)

I don’t understand what this "middleware layer" does.  What is its functionality as a "middleware layer"?

Trey Pendragon (12:19 PM)

Basically everything I said above. Metadata storage and access, association of metadata with binary content, authorization of resources, notifications of resource changes, interfaces to ensure binaries are what you want at both ends of the pipe.

Now, whether that middleware layer is worth the effort it takes to develop and maintain it is another question I don’t have an answer for.

I think a lot of us believe the answer is yes.

Fortunately whatever our opinion, Valkyrie will let us work on the parts that shouldn’t have to care, together.

Mark Matienzo (12:21 PM)

@tpendragon++ # well said

Jonathan Rochkind (12:27 PM)

I don’t really get it myself, but so it goes!

I guess I don’t get why you want a middleware layer to do that. But I guess the answer is when you need multiple front-ends, especially if implemented on multiple platforms, sharing that business logic? ok.

esmé cowles (12:29 PM)

@jrochkind yes — coordinating between multiple apps is definitely what middleware is supposed to help with

Jonathan Rochkind (12:29 PM)

ok, thanks. i understand better now. I wonder what percentage of samvera users have that need though.

and I think there’s maybe agreement that Fedora isn’t actually serving that need very well, at least as we are using it?

esmé cowles (12:30 PM)

yes, i think it’s fair to say that fedora provides a bunch of middleware functionality that samvera is not using

Jonathan Rochkind (12:31 PM)

good to know. Do you have a sense of historically why we aren’t using it?

If we aren’t using it, that would possibly seem to suggest we don’t actually need it? "we", I don’t know what that means either, haha.

esmé cowles (12:31 PM)

part of it is that a lot of the fedora functionality is new-ish, and samvera found other (ruby) alternatives when it needed functionality before the fedora option was available

esmé cowles (12:33 PM)

the other part is that samvera developers would rather code in ruby, so we’ve preferred ruby community stuff over fedora options

Peak Armintor (12:35 PM)

if samvera is your only fedora app, and you mostly participate in the samvera community, writing your own software, and you don’t need to write other integrations around fedora, sure: maybe you don’t need the seam of a network api

Jonathan Rochkind (12:35 PM)

it’s interesting to hear @tpendragon suggest that he does believe a middleware layer is worthwhile, but does not think RDF is serving us well. Does that mean you think RDF-based fedora is not ideal as a middleware layer, @tpendragon ?

@barmintor it would be interesting to see how much of the samvera community is described by that description.

Peak Armintor (12:36 PM)

@jrochkind none of the ones that use sufia or its successors

Jonathan Rochkind (12:36 PM)

eh? I don’t get it.

Peak Armintor (12:36 PM)

no comment.

Jonathan Rochkind (12:37 PM)

We use sufia. Samvera is our only fedora app. We mostly participate in the samvera community. We don’t need to write other integrations around fedora. I guess "writing our own software" is the one we don’t qualify for?  So it’s just the fact that we use sufia means that we need the seam of a network api?

esmé cowles (12:37 PM)

i don’t think anyone believes fedora is an ideal middleware layer — i think the API spec’d fedora will be a better middleware layer, though

Jonathan Rochkind (12:37 PM)

better than what?

esmé cowles (12:37 PM)

better than fedora 4.x

Peak Armintor (12:38 PM)

@mbklein we’re going through a fairly agonizing situation right now where multiple applications query the same db backend (like, the same tables)

Michael Klein (12:38 PM)

My assumption/understanding is that Sufia/Hyrax uses Fedora in a way that doesn’t make use of some of Fedora’s features, while using others in a very object-model-specific way that doesn’t leave that data well positioned to be used by non-Sufyrax-based applications.

esmé cowles (12:38 PM)

> Sufyrax™️

Peak Armintor (12:38 PM)

@mbklein by "we" I mean my place of work

Michael Klein (12:39 PM)

Understood

Peak Armintor (12:39 PM)

@mbklein likewise, agonies around places where multiple apps directly query the same Solr core

@mbklein all of those are situations where an API would save us a ton of work, and keep data more current, etc.

@mbklein the question then turns to the economies of effort around maintaining the implementation of said API

@mbklein those economies take on a still different character when the implementation is outsourced to a commonly-developed project

Jonathan Rochkind (12:41 PM)

"An API"  — at the fedora level? I mean, fedora already has an API, so you mean a better API?

Michael Klein (12:42 PM)

These are all super-reasonable answers to the questions I asked. And useful, because (and this goes to the heart of why I asked) I am not sure those are the reasons my higher-level, non-developer stakeholders think we’re using Fedora.

Peak Armintor (12:42 PM)

@mbklein there is also a characteristic of using a common backend API that I tried (and apparently failed, TBH) to talk about at HyConn last year

@mbklein which is: the API is an embodiment of shared community practice

Mike Giarlo (12:43 PM)

@mbklein "because preservation?"

Michael Klein (12:44 PM)

I think there’s a sense, coming from the original FEDORA grant and versions 1.x-3.x of the software, that Fedora is a Preservation Vault that immediately makes the things we put into it more… something.

Mike Giarlo (12:44 PM)

Right. (And wrong.)

esmé cowles (12:44 PM)

@mbklein absolutely — there is definitely magical thinking around fedora and preservation

Mike Giarlo (12:45 PM)

the Fedora Leadership group attempted to shed some light on this ☝️

esmé cowles (12:45 PM)

preservation is a human activity, not something any software can do for you

Mike Giarlo (12:45 PM)

(There is magical thinking about all sorts of technologies, especially among folks that do not have a deep understanding of said technologies.)

Jonathan Rochkind (12:45 PM)

"I am not sure those are the reasons my higher-level, non-developer stakeholders think we’re using Fedora." OH MY YES.

Magical thinking, and also "We’re using it because everyone else is using it full stop".

Peak Armintor (12:46 PM)

@mbklein I would take a fairly bold stance about what Fedora is- it’s the way you use the backend. In the case of v4, Fedora is the way the Fedora community is using Modeshape. They may or may not be using it to support the practice of preservation.

Justin Coyne (12:46 PM)

So, Fedora is used because it can support some preservation related activities.   But it does not provide a full solution, right?

hackmaster.a (12:46 PM)

Well it’s not entirely magical thinking. There were marketing efforts.

Peak Armintor (12:47 PM)

@mbklein the API spec effort is an attempt to broaden the possibilities of backend.

@mbklein again, that’s a very editorial view.

Michael Klein (12:47 PM)

Good to know.

Mike Giarlo (12:48 PM)

@jcoyne you mean, "provide a full [preservation] solution?" (edited)

esmé cowles (12:48 PM)

@jcoyne no software is a complete solution: you need human judgement, understanding of your preservation requirements, tradeoffs against resources, etc., etc.

Mike Giarlo (12:48 PM)

The answer to that question depends entirely on your institution’s philosophy around digital preservation, IMO.

And those are quite divergent.

Jonathan Rochkind (12:49 PM)

"it’s the way you use the backend. In the case of v4, Fedora is the way the Fedora community is using Modeshape."  Then the question is just, why are we using modeshape, no?

Mike Giarlo (12:49 PM)

But from the Fedora perspective, I don’t think it is wise, or has ever been wise, to market it as a full preservation solution. Rather, we should just say "here’s what we do" and leave it to folks to decide if that is sufficient for their preservation needs.

esmé cowles (12:49 PM)

"preservation-enabling functionality" is the term, i believe

hackmaster.a (12:49 PM)

I recently tried out Cantaloupe, which provides extension points in ruby. I looked at the code, and it had been written to allow future development for plugging in in other languages, also. I haven’t looked at, e.g. API-x, etc, but if Fedora is providing REST APIs then it seems like there’s already awareness that people may not want to write java. does API-x require java?

the other part is that samvera developers would rather code in ruby, so we’ve preferred ruby community stuff over fedora options

— esmé cowles

Michael Klein (12:50 PM)

I also admit that I am completely unaware of whatever preservation-enabling functionality is buried within fcrepo 4.x’s configuration files, and how to use it.

Jonathan Rochkind (12:50 PM)

I know of lots of administrators and managers that think fedora is a  preservation solution and that’s why to use it. (And don’t know to think the difference between a "full" preservation solution and a "not full" one. Admins think at high levels). I think they have received marketing to that effect, but maybe they just misunderstood the marketing.

esmé cowles (12:50 PM)

@hackmaster.a i’m not sure — i think it might support other languages

Mike Giarlo (12:50 PM)

@mbklein here’s a high-level view of the preservation-enabling functionality: http://fedorarepository.org/fedora-and-digital-preservation

Jonathan Rochkind (12:51 PM)

@mbklein I’m not even sure what "preservation-enabling functionality" is. When I ask, the only answer I get is fixity checksum recording. Which is such a trivial feature to implement in any system, no?

Mike Giarlo (12:51 PM)

@jrochkind ☝️

esmé cowles (12:51 PM)

@hackmaster.a at the very least, it provides the ability to call out to external services, which could of course be in other languages/platforms/etc.

Michael Klein (12:51 PM)

The link @mjgiarlo shared above gets into some of that.

Jonathan Rochkind (12:51 PM)

@mjgiarlo yeah, okay. fixity and versioning.

hackmaster.a (12:52 PM)

I’ve also seen at least one samvera person (don’t remember who) say that eventual consistency is not acceptable to them, and so they would not be happy if samvera code started using fedora’s messaging.

Jonathan Rochkind (12:53 PM)

most of those features in @mjgiarlo’s link do not seem that exciting or difficult. Versioning is a difficult one. Which fedora may be doing well, but samvera’s integration with fedora I don’t think is. We’re kind of scared to use fedora versioning because we don’t know what it will break in our app.

The import/export bullet point, I believe has not been successful. Has anyone successfully achieved those objectives aimed at? Integration with preservation systems external to fedora, avoidance of platform lock-in? I guess the answer can always be "we’re still working on it", but at some point you wonder about cost-benefit of that continued work. (edited)

Trey Pendragon (12:55 PM)

@hackmaster.a My understanding, and it’s been corroborated, of API-X is that it’s less of a feature and more of a pattern. "If you want to add a microservice for doing something specific with your repository, host that service in the same domain as Fedora using some sort of reverse proxy mechanism"

Peak Armintor (12:55 PM)

@jrochkind if it seems easy to do well, go for it. Historically doing those things in a shareably generic way was something like 200k lines of Java, and nobody could support it.

Trey Pendragon (12:55 PM)

@hackmaster.a "And here’s a community of people doing that."

Peak Armintor (12:56 PM)

@jrochkind I would also invite you to look at this community’s difficulty supporting core libraries

@jrochkind when you’re done, I hope we’ve got an API spec that we can ask you to implement

Jonathan Rochkind (12:57 PM)

@barmintor It does not seem easy to do. I am saying that something that is not easy to do and that has not been done probably does not belong on a list of "what preservation-related features fedora provides". (edited)

I guess if it’s a wishlist of what preservation-related features we’d like fedora to provide that’s different.

That list is titled "Fedora provides the following features in support of preservation".

Trey Pendragon (12:59 PM)

I want to point out, that marketing has been SO GOOD that we’re now talking about Fedora as a preservation tool again. (edited)

Peak Armintor (1:01 PM)

@jrochkind @mbklein I’m sorry. I’m going to excuse myself here - this is ground I feel like is pretty well-trod, and I get all het up because much of what sucks to you now is in some way my fault. It’s personally draining to constantly relitigate, but I also have a hard time not engaging. I don’t understand why I got tagged into "barmintor got really really mad at me", and I should have stayed away.

Jonathan Rochkind (1:01 PM)

I have no idea what your role was in any of it, so didn’t mean to be saying any of it was your fault.

Michael Klein (1:02 PM)

FWIW, I don’t think it sucks, and I certainly don’t think it’s your fault.

Jonathan Rochkind (1:03 PM)

This is an interesting point on that list @mjgiarlo linked: "Deposited files are stored on the filesystem in a predictable location based on their checksums, which allows operating system-level access to files independent of Fedora"  Interestingly, we considered accessing bytestreams on file system directly without going through fcrepo API, but decided we couldn’t tell if it was risky and would break in future fcrepo versions, and I think others told us we were probably right. (edited)

It makes me think that list is marketing, and not reality.
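
The quoted bullet about files "stored on the filesystem in a predictable location based on their checksums" describes what is generally called content-addressed storage. A minimal Python sketch of the general idea follows. The directory layout here (SHA-1, three two-character directory levels) and the function names are assumptions chosen for illustration, not a statement of fcrepo's actual on-disk layout, which should be checked against the Fedora documentation before relying on it.

```python
import hashlib
from pathlib import Path


def content_path(root, data, levels=3, width=2):
    """Derive a storage path from the bytes themselves (hypothetical layout)."""
    digest = hashlib.sha1(data).hexdigest()
    # Use the leading characters of the digest as nested directory names.
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return Path(root).joinpath(*parts, digest)


def store(root, data):
    """Write the bytes at their checksum-derived path and return that path."""
    target = content_path(root, data)
    target.parent.mkdir(parents=True, exist_ok=True)
    if not target.exists():  # identical content maps to the same file
        target.write_bytes(data)
    return target
```

Under such a scheme anyone who knows a file's checksum can locate it with ordinary filesystem tools. Whether it is safe to depend on that across software versions is exactly the question raised in the message above.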

Michael Klein (1:05 PM)

I just want to make sure that when I go to sell my stakeholders on a thing (either "it’ll work for our needs" or "it won’t work for our needs"), that the things I say line up with reality. I’ve had a hard time knowing what Fedora’s promises and current reality and promised reality are, and I’m perfectly willing to believe/accept that it’s my own lack of involvement in non-Samvera-related Fedora discussions that led to that gap in my understanding.

@hackmaster.a: I’ll admit that I’m one who is wary of "eventual consistency." But only because I need to be able to write a thing and read it back and know that it’s what it should be. Intermediate layers can help with that by putting the fast bits in front and letting the slow bits catch up. But I’m not sure Fedora messaging is the conduit to make that happen.
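
The "fast bits in front, slow bits catch up" idea can be sketched in Python as a thin layer that answers reads from its own pending writes until the slow backend has applied them. This is an illustrative, single-process toy under stated assumptions: the class name is invented, a plain dict stands in for the slow store, and nothing here reflects how Fedora messaging or any Samvera component actually works.

```python
from collections import deque


class ReadYourWritesStore:
    """Serve reads from pending writes until the slow store catches up."""

    def __init__(self, slow_store):
        self.slow_store = slow_store  # a dict standing in for the slow backend
        self.pending = {}             # key -> (sequence, value) not yet confirmed
        self.backlog = deque()        # writes waiting to reach the slow store
        self.sequence = 0

    def write(self, key, value):
        """Accept the write immediately and queue it for the slow store."""
        self.sequence += 1
        self.pending[key] = (self.sequence, value)
        self.backlog.append((self.sequence, key, value))

    def read(self, key):
        """Prefer the newest pending value; otherwise ask the slow store."""
        if key in self.pending:
            return self.pending[key][1]
        return self.slow_store.get(key)

    def flush_one(self):
        """Apply the oldest queued write to the slow store."""
        sequence, key, value = self.backlog.popleft()
        self.slow_store[key] = value
        # Drop the pending entry only if no newer write has superseded it.
        if self.pending.get(key, (None, None))[0] == sequence:
            del self.pending[key]
```

A depositor who saves a record and reloads the page sees the value they just submitted, even while the backend still holds the older one. The cost is that the front layer must now be kept reliable and shared by every application that reads the data.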

hackmaster.a (1:11 PM)

@mbklein the comment stuck with me at the time, maybe it was yours. I was curious about the actual scenario in which you need to write a thing and read it back immediately.

Jonathan Rochkind (1:12 PM)

having an admin screen where you press save and then see something different than you just saved would prob be a problematic UX.  There are ways around this in an "eventual consistency" system, but there are also ways to not get around it. 🙂

hackmaster.a (1:13 PM)

yeah I guess you would need a more immediate storage layer.

Michael Klein (1:13 PM)

That’s exactly it. Especially in self-deposit. Having someone enter data and upload a file, only to see something different from what they submitted reflected in the UI, is disconcerting.

hackmaster.a (1:14 PM)

Ah, right, the self-deposit. got it.

Jonathan Rochkind (1:14 PM)

I don’t think that’s an acceptable UX anywhere.

hackmaster.a (1:15 PM)

No, I agree with you @jrochkind. but definitely the end-user scenario of self-deposit had escaped my mind.

Mike Giarlo (1:19 PM)

@jrochkind The people who worked on that list were a mix of dev-y types and manager-y types.

So I don’t think it’s "marketing and not reality"

FWIW, Fedora’s use of filesystems has been quite stable, @jrochkind. It did change considerably between 3 and 4, but I’m not aware of major changes between Fedora 1 and 3. Or between Fedora 4.0.0 and 4.7.4.

@jrochkind We built a lot off of "accessing bytestreams on file system directly without going through fcrepo API" during my time at Rutgers, FWIW, and that stuff is probably still running 13 years later.

carolyn caizzi (4:48 PM)

And I want to just say that as a manager, there is no way I think Fedora is the magical solution to the entirety of digital preservation; but I believe in its ability to be part of the mix (on the tech side) and there has been a huge investment in it and there is from what I can tell the openness and willingness to change.  I think there are some assumptions going on here about managers having reductionist thoughts; exactly the opposite.  No one in the field of digital preservation, especially managers and the digital preservation librarians I know, think there is one thing that DOES preservation; people, data storage, policies, software, etc. is just scratching the surface.  Sorry, I just had to insert myself here (as @mbklein’s manager).

I know of lots of administrators and managers that think fedora is a  preservation solution and that’s why to use it. (And don’t know to think the difference between a "full" preservation solution and a "not full" one. Admins think at high levels). I think they have received marketing to that effect, but maybe they just misunderstood the marketing.

— Jonathan Rochkind

Mike Giarlo (4:54 PM)

@ccaizzi++

Jonathan Rochkind (4:59 PM)

you’re right, sorry for seeming to bash managers @ccaizzi!

I do think we need to consider the "sunk cost fallacy" when thinking of things we’ve made a huge investment in. Sometimes even though you’ve made a huge investment, it’s appropriate to switch things up.

In the library environments I have worked in and experienced (which may not be typical), the decision-makers-with-power have seemed to me to be intolerant of switching things up, of doing things differently than "everyone else", or of admitting something didn’t work out as expected and a new tack should be tried.

carolyn caizzi (5:00 PM)

We like change over here at NUL

And I think I put some 🔥 under @mbklein today. 🙂

Michael Klein (5:02 PM)

@ccaizzi: I didn’t mean to imply that you (or anyone else) sees Fedora as a set-it-and-forget-it black box preservation solution. I just think sometimes we talk of it as if it provides preservation features out of the box that either it doesn’t, or that Samvera’s stack doesn’t take advantage of, or that we haven’t switched on.

Jonathan Rochkind (5:02 PM)

"admitting something didn’t work out as expected and a new approach should be tried" is something I think is in the corner of "high importance" and "low willingness to do" on the quad graph.  In my experience and interpretation of that experience, that may not be others.

steve van tuyl (5:02 PM)

but 'switching things up' can be enormously resource intensive. I’m not defending a head in the sand approach to making decisions, but it can be really hard to justify allocating those resources

Michael Klein (5:02 PM)

I’m always 🔥. I just don’t always let it out. 🙂

carolyn caizzi (5:02 PM)

I heard that, I was responding to a specific message. 🙂

@ccaizzi: I didn’t mean to imply that you (or anyone else) sees Fedora as a set-it-and-forget-it black box preservation solution. I just think sometimes we talk of it as if it provides preservation features out of the box that either it doesn’t, or that Samvera’s stack doesn’t take advantage of, or that we haven’t switched on.

— Michael Klein

Jonathan Rochkind (5:03 PM)

@vantuyls absolutely!  But staying with what you’ve got can be enormously resource intensive too. That’s the "sunk cost fallacy" in a nutshell.

steve van tuyl (5:03 PM)

to some extent it’s not a fallacy. i mean, both of those decisions have costs.

Jonathan Rochkind (5:04 PM)

Well, in the cases where staying with what you’ve got is more resource intensive than switching it up, that’s where it’s a fallacy. Or being unwilling to consider when that may be true. Of course, it’s still an educated guessing game, you never know for sure. (edited)

Jonathan Rochkind (5:04 PM)

But responding to "you never know for sure" with an automatic "so we’ll stay with what we’ve got without trying to analyze the cost" is, IMO, the sunk cost fallacy.

steve van tuyl (5:04 PM)

@jrochkind absolutely.

Jonathan Rochkind (5:05 PM)

"Reasoning that further investment is warranted on the fact that the resources already invested will be lost otherwise, not taking into consideration the overall losses involved in the further investment."  https://www.logicallyfallacious.com/tools/lp/Bo/LogicalFallacies/173/Sunk-Cost-Fallacy (edited)

carolyn caizzi (5:05 PM)

Yes, if it’s going to cost, we need data to make informed decisions.  The best we can do is make an informed decision.

but 'switching things up' can be enormously resource intensive. I’m not defending a head in the sand approach to making decisions, but it can be really hard to justify allocating those resources

— steve van tuyl

Jonathan Rochkind (5:05 PM)

I honestly don’t know how we create data on the cost of developing things. Any examples of such data?

steve van tuyl (5:06 PM)

one thing i’ve found challenging on this front is explaining to management (and coworkers not in my department) why making an apparently 'expensive' decision (e.g. to move to a new system) is less costly than staying with the old system. it can be very hard to see what the cost of maintenance of an old thing is, for some folks.

Jonathan Rochkind (5:06 PM)

I mean, we can have data on how many staff hours are spent now. But data on the staff-hour cost of one decision vs another going forward? I don’t know how to do that. (edited)

steve van tuyl (5:07 PM)

@jrochkind the only example i know of is in the area of Ecosystem Services - where ecologists/economists assign value to the functions of an ecosystem. e.g. my mangrove swamp filters the water and the same amount of water filtration would cost $X if we were to build a filtration system.

fyi. i’d be willing to work with anyone interested to write up some cost estimates for things like participating in FOSS communities etc.

Jonathan Rochkind (5:08 PM)

the problem is we don’t really know how many staff hours it will cost to, say, move to valkyrie vs sticking with current stack.  Experienced professionals can make estimates, but I’m not sure I’d call that "data" in the sense of something empirically measured. (edited)

I don’t think the valkyrie effort is motivated by "data" exactly. Correct me if i’m wrong.

steve van tuyl (5:10 PM)

isn’t the valkyrie effort motivated by a need for better performance? that performance is measured in time.

carolyn caizzi (5:10 PM)

Sorry, I think there was way more to my comment and our situation here than I can share in chat.

the problem is we don’t really know how many staff hours it will cost to, say, move to valkyrie vs sticking with current stack.  Experienced professionals can make estimates, but I’m not sure I’d call that "data" in the sense of something empirically measured.

— Jonathan Rochkind

Jonathan Rochkind (5:10 PM)

yeah, risk is an important concept here too.

Mike Giarlo (5:23 PM)

@vantuyls I’d say performance was a catalyst in getting Valkyrie created, but the motivation for Valkyrie or something like it has been around for at least a couple years. See also @jpstroop’s and my talks about the gobstopper and the need for more "air gaps" in the stack.

@tpendragon can correct me if I’ve been inaccurate.

steve van tuyl (5:26 PM)

every motivation needs a catalyst in order for the thing to get done

Jonathan Rochkind (5:26 PM)

one hopes for a catalyst other than horrible crushing failure, haha.

Jon Stroop (5:50 PM)

@vantuyls I’d add that it would also be nice if we could simplify the stack, thus making onboarding easier and less maddening for new devs (edited)

Trey Pendragon (7:33 PM)

One purpose of the Data Mapper group was to determine if switching the pattern we use would help with the time we spend developing our applications. It’s been good so far, I think, but we haven’t discussed and finalized our thoughts.

Figgy has been a great exercise to determine how long it takes and how reliable starting over is, too. I don’t think it’s impossible to quantify, that’s part of the reason for spikes yeah?

Anyways, yes, Valkyrie started as a way to figure out performance, but it’s also a response to years of talking (like this) about how AF and the AR pattern don’t fit how we store and access items.
