A File Renamer

I guess this thread has spawned another issue. Let me close this and say I
will look for some other project to work on. Honestly, I do not support
piracy either; the file was just a dummy one for simulation purposes, not
an actual movie file. (I have also changed the subject :wink:)

I would appreciate it if we didn't get into the topic of piracy or theft of
movies, as this is a Ruby forum.

Mayank

···

On Thu, Jun 30, 2011 at 6:19 AM, Chad Perrin <code@apotheon.net> wrote:

On Thu, Jun 30, 2011 at 07:35:03AM +0900, Sam Duncan wrote:
> Before this goes completely off the rails [sic], my points are;

No, wait -- this could be instructive for later threads:

>
> *) Call the thread "A File Renamer"

. . . or you could have pretended that's what it's called, and none of
this would have happened.

>
> *) Make up an example filename

. . . or you could consider that the person in question might have used
the same software to create a copy for personal use as whoever it was
that posted an illegal copy on the Internet, and not jump to conclusions,
thus potentially sparking an off-topic flamewar.

>
> *) Profit

. . . or you could have contacted the person *personally* to tell him/her
your opinion of "stealing" (which isn't even the correct term) rather
than subject the rest of us to your self-righteousness.

I implore anyone else who considers taking the same approach as Sam
Duncan to consider my list of alternatives if any of you get the urge in
the future.

With that, I'm finished. Feel free to offer more declarations of moral
turpitude from on high if you like, Sam.

--
Chad Perrin [ original content licensed OWL: http://owl.apotheon.org ]

--
Mayank Kohaley

I guess this thread has spawned another issue. Let me close this and
say I will look for some other project to work on.

NOOOOO. Make it work.

I would appreciate if we don't indulge in this topic of piracy or
theft of movies as this is a Ruby forum.

Yes, I've seen the adverts and now I know that Piracy Funds Terrorism.
Clearly the IRA and Hamas really love sharing Hollywood films and pop
music with bittorrent.

To be frank I'm appalled that the censors say this shit. It cheapens
an important problem. I think you should keep downloading things just
to piss them off. Whayyy.

Is there a pattern to the file names you are working with? The key is
to find a pattern and write code to work with that pattern.

If the pattern requires a lot of heuristics, then it's going to be
more difficult to do, but not impossible.

For example, with your previous example, how would you know where the
movie name ends and the meta data begins? It might be possible to keep
an array (or a hash) of the meta data, _split_ the file name on some
delimiter (such as a dot) and _filter_ out items that are contained in
the array (or hash).
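
A minimal sketch of that split-and-filter idea, in case it helps. The token list and the sample file name are made up for illustration, and the helper names (`noise_tokens`, `clean_name`) are hypothetical; your real files may need a different delimiter or a longer list.

```ruby
# Hypothetical metadata tokens; a real list would come from looking at
# your own files.
def noise_tokens
  %w[dvdrip xvid 720p 1080p x264 bluray]
end

# Split the base name on dots, drop known metadata tokens, and join
# what is left with spaces. The extension is kept unchanged.
def clean_name(filename, noise = noise_tokens)
  ext   = File.extname(filename)
  base  = File.basename(filename, ext)
  words = base.split(".").reject { |token| noise.include?(token.downcase) }
  words.join(" ") + ext
end

puts clean_name("Some.Example.Title.720p.x264.avi")
```

This only computes the new name. Passing the old and new names to `File.rename` would do the actual rename, and printing first is a cheap dry run.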

···

On Thu, Jun 30, 2011 at 1:48 AM, Mayank Kohaley <mayank.kohaley@gmail.com> wrote:

[snip]

Also, read the API documentation. In my opinion, that is the key to
dealing with a new language.

···

2011/6/30 Jeremy Heiler <jeremyheiler@gmail.com>

On Thu, Jun 30, 2011 at 1:48 AM, Mayank Kohaley <mayank.kohaley@gmail.com> wrote:
> [snip]

Is there a pattern to the file names you are working with? The key is
to find a pattern and write code to work with that pattern.

If the pattern requires a lot of heuristics, then it's going to be
more difficult to do, but not impossible.

For example, with your previous example, how would you know where the
movie name ends and the meta data begins? It might be possible to keep
an array (or a hash) of the meta data, _split_ the file name on some
delimiter (such as a dot) and _filter_ out items that are contained in
the array (or hash).

Is there a pattern to the file names you are working with? The key is
to find a pattern and write code to work with that pattern.

You know the video file itself probably has meta data.

Inspecting that would be harder than just regex matching on the
filename, but the program is going to be much more robust in the face of
crappy input.
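
For comparison, here is roughly what the "regex matching on the filename" route could look like. This is only a sketch under an assumed naming pattern (dot-separated title, then a four-digit year, then optional extra tokens, then the extension); `parse_name` and the sample names are hypothetical, and I am not assuming any particular library for the metadata route.

```ruby
# Assumed pattern: title words separated by dots, a four-digit year,
# optional extra tokens, then the extension.
def parse_name(filename)
  pattern = /\A(?<title>.+?)\.(?<year>(?:19|20)\d{2})(?:\..*)?(?<ext>\.\w+)\z/
  m = pattern.match(filename)
  return nil unless m   # name does not fit the pattern; handle it by hand
  { title: m[:title].tr(".", " "), year: m[:year].to_i, ext: m[:ext] }
end

p parse_name("Some.Example.Title.2011.720p.avi")
p parse_name("holiday_video.avi")
```

The second call returns `nil` because there is no year to anchor on, which is exactly the kind of input where the filename approach gives up.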

Agreed.

Also, "interact" with the API with an IRB session open right next to
it. The best way to learn something long-term is to understand it and
then apply it.
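
For instance, these are the kind of lines one might type into IRB, one at a time, with the `File` and `String` documentation open alongside. The file name is made up; the methods are all from Ruby core.

```ruby
# Lines you might type into an IRB session one at a time.
name = "Some.Example.Title.avi"   # made-up file name

File.extname(name)                # the extension, including the dot
File.basename(name, ".avi")       # the name without that extension
name.split(".")                   # array of the dot-separated parts

# Discover related methods before looking them up in the docs.
String.instance_methods(false).grep(/sub/)
```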

···

On Thu, Jun 30, 2011 at 9:41 AM, coolesting <coolesting@gmail.com> wrote:

Also, read the API documentation. In my opinion, that is the key to
dealing with a new language.

Do you have suggestions for Ruby libraries suitable to the task of
reading video file metadata?

···

On Fri, Jul 01, 2011 at 12:55:37AM +0900, Johnny Morrice wrote:

> Is there a pattern to the file names you are working with? The key is
> to find a pattern and write code to work with that pattern.

You know the video file itself probably has meta data.

Inspecting that would be harder than just regex matching on the
filename, but the program is going to be much more robust in the face of
crappy input.

--
Chad Perrin [ original content licensed OWL: http://owl.apotheon.org ]

But that requires a different set of assumptions: the files are videos (it is
called a "File" renamer, no longer a "Movie" renamer), and they're all the
same format or you have a tool that will let you access this format.

I did something like this with my music collection. CDs that I'd ripped
were MP3s and could be renamed nicely, but the ones I'd bought off iTunes
were m4a files and didn't have this info. Some of the music I'd ripped a long
time ago didn't have all the same tags filled out, etc. In the end I had a
tool that renamed maybe 90-95% of my music and the rest I had to do by hand.
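
The general shape of such a tool can be sketched as below. This is not the actual tool, just an illustration: the tag hashes are made up, `plan_renames` is a hypothetical name, and reading real tags would need a library that this sketch does not assume.

```ruby
# Each entry pairs a current file name with whatever tags were read from
# it. Files with incomplete tags are set aside for manual renaming.
def plan_renames(tracks)
  renames = {}
  manual  = []
  tracks.each do |file, tags|
    artist = tags[:artist]
    title  = tags[:title]
    if [artist, title].any? { |v| v.nil? || v.strip.empty? }
      manual << file              # incomplete tags: leave for a human
    else
      renames[file] = "#{artist} - #{title}#{File.extname(file)}"
    end
  end
  [renames, manual]
end

tracks = {
  "track01.mp3" => { artist: "Example Artist", title: "Example Song" },
  "track02.m4a" => {},
  "track03.mp3" => { artist: "Example Artist", title: "" }
}
renames, manual = plan_renames(tracks)
renames.each { |from, to| puts "#{from} -> #{to}" }
puts "By hand: #{manual.join(', ')}"
```

Keeping the planning step separate from the actual `File.rename` calls makes it easy to review the list before touching anything.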

···

On Thu, Jun 30, 2011 at 10:55 AM, Johnny Morrice <spoon@killersmurf.com> wrote:

> Is there a pattern to the file names you are working with? The key is
> to find a pattern and write code to work with that pattern.

You know the video file itself probably has meta data.

Inspecting that would be harder than just regex matching on the
filename, but the program is going to be much more robust in the face of
crappy input.

Yes, the Ruby team made IRB, and it is very convenient. I have never
used it this way to learn a programming language.

···

2011/6/30 Jeremy Heiler <jeremyheiler@gmail.com>

On Thu, Jun 30, 2011 at 9:41 AM, coolesting <coolesting@gmail.com> wrote:
> Also, read the API documentation. In my opinion, that is the key to
> dealing with a new language.
>

Agreed.

Also, "interact" with the API with an IRB session open right next to
it. The best way to learn something long-term is to understand it and
then apply it.

You should be aware that meta data can't be trusted. Not only do people not
know how to use fields properly, they will also use different character sets,
like stuffing in Unicode when the tag claims to be ASCII.
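
A small sketch of how one might guard against that in Ruby, assuming the tag value arrives as raw bytes. The two helper names are hypothetical; the `String` methods used (`force_encoding`, `valid_encoding?`, `ascii_only?`) are from Ruby core.

```ruby
# True when a tag that claims to be ASCII actually contains other bytes.
def claims_ascii_but_is_not?(raw_bytes)
  !raw_bytes.dup.force_encoding(Encoding::BINARY).ascii_only?
end

# Returns the tag text as UTF-8 when the bytes are valid UTF-8 (plain
# ASCII included), or nil when they are not and need a human to look.
def normalize_tag(raw_bytes)
  text = raw_bytes.dup.force_encoding(Encoding::UTF_8)
  text.valid_encoding? ? text : nil
end

raw = "Caf\u00E9".b                 # UTF-8 bytes in a field labelled ASCII
puts claims_ascii_but_is_not?(raw)
puts normalize_tag(raw)
p normalize_tag("\xFF\xFE".b)       # not valid UTF-8
```

Treating everything as UTF-8 first is an assumption on my part; tags written in other legacy encodings would still end up in the "look at it by hand" pile.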

Not to mention using online databases can be a real exercise in frustration.
Again people seem incapable of using fields properly and there are numerous
duplicates for items that shouldn't be there.

I think you're getting dragged down into the implementation of a project
instead of learning the core concepts of OO design. I would avoid the whole
mess and do something more focused like working through the Ruby Koans:

Here are some other resources for learning OO design and the concepts behind
it:

Berkeley - 61A - The Structure and Interpretation of Computer Programs
(includes iTunes links)
Videos:
http://webcast.berkeley.edu/playlist#c,d,Computer_Science,D7B8D6A4834C14C8
Course resources: CS 61A Home Page
Original videos and the book:
http://groups.csail.mit.edu/mac/classes/6.001/abelson-sussman-lectures/

Stanford - 106B - Programming Abstractions
Videos:
http://171.64.93.201/ClassX/system/users/web/Subject.php?subject=CS106B_LECTURES_SPR2011
Course resources: CS106B Home

···

On Thu, Jun 30, 2011 at 2:50 PM, Josh Cheek <josh.cheek@gmail.com> wrote:

On Thu, Jun 30, 2011 at 10:55 AM, Johnny Morrice <spoon@killersmurf.com> wrote:

> > Is there a pattern to the file names you are working with? The key is
> > to find a pattern and write code to work with that pattern.
>
> You know the video file itself probably has meta data.
>
> Inspecting that would be harder than just regex matching on the
> filename, but the program is going to be much more robust in the face of
> crappy input.
>
>
But that requires a different set of assumptions: the files are videos (it is
called a "File" renamer, no longer a "Movie" renamer), and they're all the
same format or you have a tool that will let you access this format.

I did something like this with my music collection. CDs that I'd ripped
were MP3s and could be renamed nicely, but the ones I'd bought off iTunes
were m4a files and didn't have this info. Some of the music I'd ripped a long
time ago didn't have all the same tags filled out, etc. In the end I had a
tool that renamed maybe 90-95% of my music and the rest I had to do by hand.

They teach OOP in SICP?

···

On Thu, Jun 30, 2011 at 3:37 PM, Mike Bethany <mikbe.tk@gmail.com> wrote:

Here are some other resources for learning OO design and the concepts
behind it:

Berkeley - 61A - The Structure and Interpretation of Computer Programs
(includes iTunes links)
Videos:
Webcast and Legacy Course Capture | Research, Teaching, and Learning
Course resources: http://wla.berkeley.edu/~cs61a/sp11/
Original videos and the book:
Structure and Interpretation of Computer Programs, Video Lectures

The file name cannot be trusted any more than the meta data.

To the other points, and those raised by Josh and Chad. Your points are
good and have virtue but this is the way I would do it! This is based
on experience with a pattern approach which suffered from inconsistent
naming.

I don't know what library you would use, never done this, but I
wouldn't repeat my previous mistake.

···

On Fri, 1 Jul 2011 05:37:11 +0900 Mike Bethany <mikbe.tk@gmail.com> wrote:

You should be aware that meta data can't be trusted.

No, SICP would be "the concepts behind [Object Oriented programming]" part of
that. For instance, abstraction is a major component of OO design, and SICP
covers that. When you go through SICP you'll see a lot of ideas you'll
recognize from OO languages and you'll have a better grasp of what's going
on. It's like learning the parts of speech before trying to write the next
great novel.

···

On Thu, Jun 30, 2011 at 4:58 PM, Josh Cheek <josh.cheek@gmail.com> wrote:

On Thu, Jun 30, 2011 at 3:37 PM, Mike Bethany <mikbe.tk@gmail.com> wrote:

> Here are some other resources for learning OO design and the concepts
> behind it:
>
> Berkeley - 61A - The Structure and Interpretation of Computer Programs
> (includes iTunes links)
> Videos:
> Webcast and Legacy Course Capture | Research, Teaching, and Learning
> Course resources: CS 61A Home Page
> Original videos and the book:
> Structure and Interpretation of Computer Programs, Video Lectures
>
>
They teach OOP in SICP?

I totally agree. Not only can people incorrectly name files, but they can use
h4x0r l33t15h in the file names, thus totally obfuscating the real name. Not
to mention how you will differentiate between the spam parts of the file
name and the real name, and the order in which they come.

I think the problem, while it *sounds* simple, is hugely complex. It's a CS
Master or PhD Thesis level project, not a "new to OO" kind of thing.

···

On Thu, Jun 30, 2011 at 5:15 PM, Johnny Morrice <spoon@killersmurf.com> wrote:

On Fri, 1 Jul 2011 05:37:11 +0900 Mike Bethany <mikbe.tk@gmail.com> wrote:

> You should be aware that meta data can't be trusted.

The file name cannot be trusted any more than the meta data.

Yeah, but isn't it kind of like learning them in Klingon?

···

On Thu, Jun 30, 2011 at 4:16 PM, Mike Bethany <mikbe.tk@gmail.com> wrote:

It's like learning the parts of speech before trying to write the next
great novel.

How sure are we that filenames provided by *others*, scraped from the
Internet, are even part of the real problem domain for this case? Maybe
the filenames *can* be trusted for the theoretical use case we face.

···

On Fri, Jul 01, 2011 at 06:37:58AM +0900, Mike Bethany wrote:

On Thu, Jun 30, 2011 at 5:15 PM, Johnny Morrice wrote:
>
> The file name cannot be trusted any more than the meta data.

I totally agree. Not only can people incorrectly name files, but they
can use h4x0r l33t15h in the file names, thus totally obfuscating the
real name. Not to mention how you will differentiate between the spam
parts of the file name and the real name, and the order in which they
come.

I think the problem, while it *sounds* simple, is hugely complex. It's
a CS Master or PhD Thesis level project, not a "new to OO" kind of
thing.

--
Chad Perrin [ original content licensed OWL: http://owl.apotheon.org ]

LOL, no, no, no... it's more like Esperanto with an Irish brogue.

···

On Thu, Jun 30, 2011 at 5:41 PM, Josh Cheek <josh.cheek@gmail.com> wrote:

On Thu, Jun 30, 2011 at 4:16 PM, Mike Bethany <mikbe.tk@gmail.com> wrote:

> It's like learning the parts of speech before trying to write the next
> great novel.
>
>

Yeah, but isn't it kind of like learning them in Klingon?

Good point. OP would have to answer that question. The scope changes the
problem dramatically but I wonder if it's not reasonable to assume the files
will be named by humans. Once you put humans into the mix you're going to
have problems... all hail our computer overlords!

···

On Thu, Jun 30, 2011 at 6:51 PM, Chad Perrin <code@apotheon.net> wrote:

On Fri, Jul 01, 2011 at 06:37:58AM +0900, Mike Bethany wrote:
> On Thu, Jun 30, 2011 at 5:15 PM, Johnny Morrice wrote:
> >
> > The file name cannot be trusted any more than the meta data.
>
> I totally agree. Not only can people incorrectly name files, but they
> can use h4x0r l33t15h in the file names, thus totally obfuscating the
> real name. Not to mention how you will differentiate between the spam
> parts of the file name and the real name, and the order in which they
> come.
>
> I think the problem, while it *sounds* simple, is hugely complex. It's
> a CS Master or PhD Thesis level project, not a "new to OO" kind of
> thing.

How sure are we that filenames provided by *others*, scraped from the
Internet, are even part of the real problem domain for this case? Maybe
the filenames *can* be trusted for the theoretical use case we face.

--
Chad Perrin [ original content licensed OWL: http://owl.apotheon.org ]