I am writing a little program and I need a function that will convert a list of some values into the list of overlapping windows, for example, for window size 4: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10,......] -> [[1, 2, 3, 4], [3, 4, 5, 6], [5, 6, 7, 8], [7, 8, 9, 10],......]
Windows must overlap by one half and input list is always infinite.
My problem is that my implementations don't want to work in constant space. First I made this function as follows (i is window length):
mkWindows :: Int -> [a] -> [[a]] mkWindows i a = (take i a):(mkWindows i (drop (div i 2) a))
main = print (take 5 (drop 1000000 (mkWindows 4 [1..])))
This gives me stack overflow. I've rewritten it without "drop":
-- Here first argument counts one half of window size (to drop it) -- and the second keeps contents next window. mkWindows_ :: Int -> [a] -> Int -> [a] -> [[a]] mkWindows_ 0 t l x = t : mkWindows_ (div l 2) (take l x) l x mkWindows_ d t l (x:xs) = mkWindows_ (d-1) t l xs
mkWindows l x = mkWindows_ (div l 2) (take l x) l x
But the result is the same.
Isn't my function tail-recursive? Is it possible to fix the problem? Thanks.
Paul Graphov <grap...@gmail.com> wrote: > Hello haskellers,
> I am writing a little program and I need a function that will convert > a list of some values into the list of overlapping windows, for > example, for window size 4: > [1, 2, 3, 4, 5, 6, 7, 8, 9, 10,......] -> [[1, 2, 3, 4], [3, 4, 5, 6], > [5, 6, 7, 8], [7, 8, 9, 10],......]
Your way of thinking is quite complicated. Reconsider what you want. You want a function, that continuously drops a certain number of elements (in this case 2) from a list and gives an initial portion of the remaining list.
windows n d = map (take n) . takeWhile (not . null) . iterate (drop d)
All functions used here are suitable for infinite lists, so the resulting composition is as well.
"iterate (drop d)" produces a list of infinite tails.
"map (take n)" takes their prefixes.
But what "takeWhile(not . null)" does in case of infinite input list? There should be no null tails.
If I try to remove "takeWhile(not . null)" the function works the same way if i need only a few windows but if i try to get millionth one it causes stack overflow again. The original version works fine. How that takeWhile helps?
Paul Graphov <grap...@gmail.com> wrote: > Wow, I's much prettier code than my...
> But I don't understand it a bit...
> "iterate (drop d)" produces a list of infinite tails.
> "map (take n)" takes their prefixes.
> But what "takeWhile(not . null)" does in case of infinite input list? > There should be no null tails.
The 'iterate' function never stops iterating. Its result is, by definition, an infinite list. This is problematic here, because without the 'takeWhile', 'drop' will happily produce the empty list out of the empty list infinitely, so you will get an infinite result list all the time, even if the argument is a finite list.
> If I try to remove "takeWhile(not . null)" the function works the same > way if i need only a few windows but if i try to get millionth one it > causes stack overflow again. The original version works fine. > How that takeWhile helps?
Honestly I have no idea. This may be a compiler or interpreter issue. It works without problems for me (GHC 6.8.3), even in interpreter mode.
> Honestly I have no idea. This may be a compiler or interpreter issue. > It works without problems for me (GHC 6.8.3), even in interpreter mode.
It seems that i got it. It's due to too much lazyness...
"map (take n) . iterate (drop d)" will produce list like this:
[(take n x), (take n (drop d x)), (take n (drop d (drop d x))), (take n (drop d (drop d (drop d x)))),.....] And all these chunks are not evaluated until I actually use them. So it causes stack overvlow when i try to evaluate million of nested drops.
And "takeWhile (not . null)" actually pattern matches them against "[]" so they are forced to be calculated.
Paul Graphov <grap...@gmail.com> wrote: > Ertugrul Söylemez <e...@ertes.de> wrote: >> Honestly I have no idea. This may be a compiler or interpreter issue. >> It works without problems for me (GHC 6.8.3), even in interpreter mode. > It seems that i got it. It's due to too much lazyness...
That's the usual reason :-) But see below.
> "map (take n) . iterate (drop d)" will produce list like this: > [(take n x), (take n (drop d x)), (take n (drop d (drop d x))), (take > n (drop d (drop d (drop d x)))),.....]
And the drop's will be shared, and the list won't actually be produced until you really use it.
So this isn't a problem unless you first force the whole list without actually evaluating its elements. And even then it's likely not a problem with the nested drop's, but the fact that you've forced a whole list of thunks.
> And all these chunks are not evaluated until I actually use them. So > it causes stack overvlow when i try to evaluate million of nested > drops.
No, because the drop's are shared. Or are you for some reason trying to evaluate the last element first when using the result (see below)?
> And "takeWhile (not . null)" actually pattern matches them against > "[]" so they are forced to be calculated.
But that evaluation will stop after it encounters the first cons, so it will leave a chunk like "y : (take (n-1) x (drop (d-1) ...))". Hence that cannot be the reason.
> It's strange that it can work different ways...
How are you actually *using* the "window" list? Are you feeding it into some program you have not shown us? If yes, that would explain why it works on Ertugrul's computer, but not on yours.
And if you're doing some sort of interpolation or integration, then I'd suspect that *this* is the place where the calculation should be made more strict. That smells like the usual "calculate one value out of many" situation, where strictness is preferable to lazyness.
So it's likely not the value-producing function that is at fault (that works fine with lazyness), but the value-consuming function.
> No, because the drop's are shared. Or are you for some reason trying > to evaluate the last element first when using the result (see below)?
Yes, I just used "!! 1000000" on it while testing. If I print all million elements no stack overflow occurs. It seems that this was the reason that programs behaved different ways.
> > And "takeWhile (not . null)" actually pattern matches them against > > "[]" so they are forced to be calculated.
> But that evaluation will stop after it encounters the first cons, > so it will leave a chunk like "y : (take (n-1) x (drop (d-1) ...))". > Hence that cannot be the reason.
Hmmm... It seems to me to get first cons one must remove ALL drops? You cannot check if (drop d (drop d(..............)) is nul until you unwind all drops... Thanks.
Paul Graphov <grap...@gmail.com> wrote: >> No, because the drop's are shared. Or are you for some reason trying >> to evaluate the last element first when using the result (see below)? > Yes, I just used "!! 1000000" on it while testing. > If I print all million elements no stack overflow occurs. > It seems that this was the reason that programs behaved different > ways.
Ok. Problem found :-)
>> > And "takeWhile (not . null)" actually pattern matches them against >> > "[]" so they are forced to be calculated. >> But that evaluation will stop after it encounters the first cons, >> so it will leave a chunk like "y : (take (n-1) x (drop (d-1) ...))". >> Hence that cannot be the reason. > Hmmm... It seems to me to get first cons one must remove ALL drops? > You cannot check if (drop d (drop d(..............)) is nul until you > unwind all drops...
Yes, you're right -- I was probably still thinking of take, and just mentally assuming drop would work similarly without having a closer look. But of course drop only produces output after it has completely unwound.
Which explains the other part.
But I'd still recommend, if you haven't done it already, to go over the actual calculation you're doing (whatever that is) and make sure that this is evaluating everything in the "right" order -- even if you don't have a stack overflow, you might use up memory unnecessesarily otherwise.