experchange > c

DFS (11-12-19, 10:55 PM)
You clc guys are masters at this stuff.

Goal: Find all the open intervals in an ordered set of numbers

nbrs = [1, 2, 4, 5, 7, 12, 1300, 10000]
opens = [(3, 3), (6, 6), (8, 11), (13, 1299), (1301, 9999)]

I think 'open intervals' is the right terminology.

My simple algorithm:

if nbrs is contiguous then exit
else
for i in range(nbrs)
if nbrs[i+1] - nbrs[i] > 1 then
startinterval = nbrs[i] + 1
if nbrs[i+1] - startinterval >= 1 then
endinterval = nbrs[i+1] - 1

==============================================
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {

int i = 0, x = 0;
int nums[] = {1, 2, 4, 5, 7, 12, 1300, 10000};
//int nums[] = {3,4,5,7,8}; //sequential, for testing
int lenNums = sizeof(nums)/sizeof(nums[0]);
int start = nums[0];
int end = nums[lenNums-1];
char *interval = "";
char *intervals[sizeof(char*) * lenNums];

//check for no open intervals
if((end-start)==(lenNums-1)) {
printf("Number set is sequential from %d to %d\n",start,end);
exit(0);
}

//find open intervals
x = 0;
for(i=0;i<lenNums;i++) {
if(nums[i+1]-nums[i] > 1) {
start = nums[i]+1;
if(nums[i+1]-start >= 1) {
end = nums[i+1]-1;
//end of algorithm

sprintf(interval,"(%d,%d)",start,end);
intervals[x] = interval; //assignment seems to work
printf("%s ",intervals[x]);
x++;

}
}
}

printf("\n\n");

//Why do the following not work, given that the
//assignment and printf() in the loop above seems to?
printf("Interval 1 = %s\n",intervals[0]);
printf("Interval 2 = %s\n",intervals[1]);
printf("Interval 3 = %s\n",intervals[2]);
printf("Interval 4 = %s\n",intervals[3]);
printf("Interval 5 = %s\n",intervals[4]);
printf("\n");
for(i=0;i<x;i++) {
printf("%s ",intervals[i]);
}

printf("\nDone");
exit(0);
}
==============================================
Bart (11-12-19, 11:20 PM)
On 12/11/2019 20:55, DFS wrote:
> You clc guys are masters at this stuff.
> Goal: Find all the open intervals in an ordered set of numbers
> nbrs  = [1, 2, 4, 5, 7, 12, 1300, 10000]
> opens = [(3, 3), (6, 6), (8, 11), (13, 1299), (1301, 9999)]


(If already sorted, then this task is fairly trivial, just a bit fiddly.
For example in my script language:

proc start=
nbrs := [1, 2, 4, 5, 7, 12, 1300, 10000]
println -nbrs # or inot nbrs
end

Output is:

[0,3,6,8..11,13..1299,1301..9999]

Which is most of it done (needs to properly define the lower bound, and
format output as desired). But the approach here involves bita-rrays,
which can use lots of memory for big numbers.)

> I think 'open intervals' is the right terminology.
> My simple algorithm:
> if nbrs is contiguous then exit
> else
>   for i in range(nbrs)
>     if nbrs[i+1] - nbrs[i] > 1 then
>       startinterval =  nbrs[i] + 1
>       if nbrs[i+1] - startinterval >= 1 then
>         endinterval = nbrs[i+1] - 1


>   char *intervals[sizeof(char*) * lenNums];


Use char ** here; see below.

>         sprintf(interval,"(%d,%d)",start,end);
>         intervals[x] = interval;    //assignment seems to work


Same problem as a week or two ago: all intervals[x] point to the same
interval string, ie. the last one when done.

I used this helper function

char* allocstring(char* s){
char *t=malloc(strlen(s)+1);
strcpy(t,s);
return t;
}

then the assignment becomes:

intervals[x] = allocstring(interval);
Ben Bacarisse (11-13-19, 01:30 AM)
DFS <nospam> writes:

> You clc guys are masters at this stuff.
> Goal: Find all the open intervals in an ordered set of numbers
> nbrs = [1, 2, 4, 5, 7, 12, 1300, 10000]
> opens = [(3, 3), (6, 6), (8, 11), (13, 1299), (1301, 9999)]
> I think 'open intervals' is the right terminology.


Not really. The open interval (3, 3) contains no integers where as the
closed interval [3, 3] contains exactly one integer: 3.

void print_gaps(int n, const int *data)
{
if (n > 1) {
if (data[0] + 1 < data[1])
printf(" (%d, %d)", data[0] + 1, data[1] - 1);
print_gaps(n - 1, data + 1);
}
}

To test drive it I used:

int main(int argc, char **argv)
{
if (argc > 1) {
int seq[argc-1];
for (int a = 1; a < argc; a++)
seq[a-1] = atoi(argv[a]);
print_gaps(argc-1, seq);
putchar('\n');
}
}
Barry Schwarz (11-13-19, 04:34 AM)
On Tue, 12 Nov 2019 15:55:21 -0500, DFS <nospam> wrote:

[..]
> int start = nums[0];
> int end = nums[lenNums-1];
> char *interval = "";


interval points to a string literal that consists of the single
character \0.

> char *intervals[sizeof(char*) * lenNums];


Not many compilers support VLAs.

Most would allocate a dynamic array. Some would define an array with
an arbitrary size based on the maximum number of elements expected.

> //check for no open intervals
> if((end-start)==(lenNums-1)) {
> printf("Number set is sequential from %d to %d\n",start,end);
> exit(0);
> }
> //find open intervals
> x = 0;
> for(i=0;i<lenNums;i++) {


What is this fetish you have for eye tests?

> if(nums[i+1]-nums[i] > 1) {
> start = nums[i]+1;
> if(nums[i+1]-start >= 1) {
> end = nums[i+1]-1;
> //end of algorithm
> sprintf(interval,"(%d,%d)",start,end);


interval still points to the string literal. You are not allowed to
modify a string literal. This causes undefined behavior.

> intervals[x] = interval; //assignment seems to work


Bar addressed this.

> printf("%s ",intervals[x]);
> x++;
> }
> }
> }
>printf("\n\n");
>//Why do the following not work, given that the
>//assignment and printf() in the loop above seems to?


They do work; they just don't do what you expect.

Run the code under a debugger and look at the values in intervals[i].
[..]
Keith Thompson (11-13-19, 05:01 AM)
Barry Schwarz <schwarzb> writes:
[...]
> Not many compilers support VLAs. [...]


Is that true? Visual Studio doesn't, but in my experience most
(other?) modern C compilers do.
DFS (11-13-19, 06:28 AM)
On 11/12/2019 4:20 PM, Bart wrote:
> On 12/11/2019 20:55, DFS wrote:
> (If already sorted, then this task is fairly trivial, just a bit fiddly.
> For example in my script language:
>     proc start=
>         nbrs  := [1, 2, 4, 5, 7, 12, 1300, 10000]
>         println -nbrs         # or inot nbrs
>     end
> Output is:
>     [0,3,6,8..11,13..1299,1301..9999]


Your language already has builtin functions to find gaps in a sequence?
Sweet. Or did you just now write it?

> Which is most of it done (needs to properly define the lower bound, and
> format output as desired). But the approach here involves bita-rrays,
> which can use lots of memory for big numbers.)



Is there a faster/better way (you mentioned bit-arrays - what does that
code look like? Seems complicated. Why would you use such a thing to
find these intervals?

>>    char *intervals[sizeof(char*) * lenNums];

> Use char ** here; see below.


That created a couple warnings:

getintervals.c:38:22: warning: assignment to ‘char **’ from incompatible
pointer type ‘char *’ [-Wincompatible-pointer-types]
38 | intervals[x] = str;
| ^
getintervals.c:46:12: warning: format ‘%s’ expects argument of type
‘char *’, but argument 2 has type ‘char **’ [-Wformat=]
46 | printf("%s ",intervals[i]);
| ~^ ~~~~~~~~~~~~
| | |
| char * char **

>>          sprintf(interval,"(%d,%d)",start,end);
>>          intervals[x] = interval;    //assignment seems to work

> Same problem as a week or two ago: all intervals[x] point to the same
> interval string, ie. the last one when done.


> I used this helper function
>     char* allocstring(char* s){
>        char *t=malloc(strlen(s)+1);
>        strcpy(t,s);
>        return t;
>     }
> then the assignment becomes:
>         intervals[x] = allocstring(interval);


OK. But outside fn() not strictly needed, so I did it in the loop, like so:

char interval[30]; //set size to accomodate
large range
char *intervals[sizeof(char*) * lenNums]; //no pointer to pointer

loop
sprintf(interval,"[%d,%d]",start,end);
char *str=malloc(strlen(interval)+1);
strcpy(str,interval);
intervals[x] = str;

It compiles and works - please tell me there's nothing wrong there.

I understand there "is no string type in C", but all that work to assign
an array of chars to an array position?

Since I already sized the array up front and set interval to the correct
value...

char *intervals[sizeof(char*) * lenNums];
sprintf(interval,"[%d,%d]",start,end);

.... what's the use of having to create the temp/working variable,
copying what's in interval to temp, then assigning the temp to the
array? The space is already allocated and the string is correctly set.

Either one of these /should/ work, but somebody's cruel:

sprintf(interval,"[%d,%d]",start,end);
strcpy(intervals[x],interval);

or

sprintf(interval,"[%d,%d]",start,end);
intervals[x] = interval;

Thanks for your help, man! At some point I'll build up C-memory and
stop asking easy questions.
DFS (11-13-19, 06:57 AM)
On 11/12/2019 6:30 PM, Ben Bacarisse wrote:
> DFS <nospam> writes:
> Not really. The open interval (3, 3) contains no integers where as the
> closed interval [3, 3] contains exactly one integer: 3.


OK. 'gaps of closed intervals in a bounded, monotonically increasing,
integer sequence' it is.

[..]
> putchar('\n');
> }
> }


If you print the whole code I'll run it.
David Brown (11-13-19, 11:34 AM)
On 13/11/2019 05:57, DFS wrote:
> On 11/12/2019 6:30 PM, Ben Bacarisse wrote:
> OK.  'gaps of closed intervals in a bounded, monotonically increasing,
> integer sequence' it is.
> If you print the whole code I'll run it.


That was almost the whole code. You need to add:

#include <stdio.h>
#include <stdlib.h>

at the start, then Ben's print_gaps and main functions. That is it.

If you call the program "gaps", then you run it with:

../gaps 1 2 4 5 7 12 1300 10000
David Brown (11-13-19, 11:52 AM)
On 13/11/2019 05:28, DFS wrote:

> I understand there "is no string type in C", but all that work to assign
> an array of chars to an array position?


The main point to understand is that there is no automatic memory
management in C. If you need to store things with different sizes, you
have to allocate memory for them along the way - using malloc or related
functions. And you have to free the memory after use. (In most
systems, once the program stops all allocated memory will be freed by
the OS - but you should get in the habit of doing it correctly in the
program.)

You are learning C. That means you need to understand these things, and
practice them. You need to get the hang of when you need to allocate
memory and when to free it, and when you can pretend that a "char *" is
a bit like a string in the sense used by other languages, and when it is
not.

You need to build up a collection of functions that you have made (or
acquired) and understand, and can re-use. So take Bart's "allocstring"
function and keep it (though I recommend being more generous with spaces
than Bart is - make your code as readable as possible). Don't inline it
into the rest of your code. That is the compiler's job.

And throw out your toy compilers. Use a /good/ compiler - clang or gcc.
Use its features well. Use its warnings, and its optimisations (which
enable better warnings). Use the sanitizers - look these up in the
reference manual for the compiler. These will often tell you when you
are doing something bad, such as memory leaks, overflows, bad
conversions, etc. They can be a big help in learning the language and
good programming habits.

If you are not trying to learn C specifically, and just want to learn
some programming to solve these kinds of programs, then C is definitely
the wrong choice of language. You then want a language that handles
strings, memory allocation, sorting, lists, etc., at a higher level.
For example, a Python solution for your program here can be written in a
single line function :

def gaps(ns) :
[(s + 1, t - 1) for (s, t) in zip(ns[:-1], ns[1:]) if (t - s) > 1]

gaps([1, 2, 4, 5, 7, 12, 1300, 10000])
David Brown (11-13-19, 11:57 AM)
On 13/11/2019 04:01, Keith Thompson wrote:
> Barry Schwarz <schwarzb> writes:
> [...]
>> Not many compilers support VLAs.

> [...]
> Is that true? Visual Studio doesn't, but in my experience most
> (other?) modern C compilers do.


I thought MSVC was moving, or had moved, to clang for C compiling?
Someone who uses MS tools could tell us.

C99 is 20 years old. A C compiler can't call itself "modern" if it
doesn't support VLA's.

Even tcc supports VLA's quite happily.
Ian Collins (11-13-19, 12:23 PM)
On 13/11/2019 22:57, David Brown wrote:
> On 13/11/2019 04:01, Keith Thompson wrote:
> I thought MSVC was moving, or had moved, to clang for C compiling?
> Someone who uses MS tools could tell us.


The clang++ front end is optional. Mind you, it is for C++ which
doesn't have VLAs.

> C99 is 20 years old. A C compiler can't call itself "modern" if it
> doesn't support VLA's.
> Even tcc supports VLA's quite happily.


In "modern" C compilers, VLAs are optional :)

Microsoft were hist ahead of their time...
Bart (11-13-19, 01:18 PM)
On 13/11/2019 04:28, DFS wrote:
> On 11/12/2019 4:20 PM, Bart wrote:
> Your language already has builtin functions to find gaps in a sequence?
> Sweet.  Or did you just now write it?


No, it just uses bit sets (you can try this code using that 'qq'
program). If I use this smaller example:

[1, 2, 4, 5, 7, 12]

Then in that language that represents an array of bits like this (always
starts from element 0, and with index shown above):

0 1 2 3 4 5 6 7 8 9 10 11 12
(0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1)

Then the bit sequence can be inverted (like using C's "~" on the bits in
an int), to get:

0 1 2 3 4 5 6 7 8 9 10 11 12
(1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1, 0)

If you now read out the index values of the 1s, you get:

[0, 3, 6, 8, 9, 10, 11]

which is normally printed as (0,3,6,8..11). This is not quite what you
want, because yours uses an arbitrary lower bound, not zero, and you
want all sequences to be displayed as (a,b).

But, the scanning involved to do that is little different from doing it
on the original data, so there is little point in doing this in C. It
would also lot more memory for sequences like [1,2, 10 000000, 10001000]
(ie. about 1.25MB here).

The advantage of this kind of representation is that if S is your
original set, then 'N in S' returns 0 when N is within one of your
'open' intervals (or 1 using 'N in -S'), in O(1) time. But it would need
a lower bound adjustment.

> Is there a faster/better way (you mentioned bit-arrays - what does that
> code look like?  Seems complicated.  Why would you use such a thing to
> find these intervals?


See above. The answer depends on what you plan on doing with the
results. But given data like (1, 2, 4, 5, 7, 12, 1300, 10000) you will
always have to start by 'visiting' each element anyway. To just build a
list of gaps, that might be all that's needed.

[..]
>       |           ~^   ~~~~~~~~~~~~
>       |            |            |
>       |            char *       char **


What options did you use? I go no warnings from 'gcc -Wall -Wpedantic
-Wextra' on the C code here:



[..]
>  strcpy(str,interval);
>  intervals[x] = str;
> It compiles and works - please tell me there's nothing wrong there.


That's fine. But you might need this often enough that having it in a
function is convenient. It also offers only one opportunity for typos
instead of each time you write it. You don't need that 'str' variable in
the main code. You can choose to put in a malloc failure check which
would be too messy in the main code. It doesn't clutter up your main code...

[..]
> or
> sprintf(interval,"[%d,%d]",start,end);
> intervals[x] = interval;


Use intervals[x]=allocstring(interval) as in my version, and you pretty
much have that!
Bart (11-13-19, 01:53 PM)
On 13/11/2019 09:57, David Brown wrote:
> On 13/11/2019 04:01, Keith Thompson wrote:
> I thought MSVC was moving, or had moved, to clang for C compiling?
> Someone who uses MS tools could tell us.
> C99 is 20 years old. A C compiler can't call itself "modern" if it
> doesn't support VLA's.
> Even tcc supports VLA's quite happily.


I don't support VLAs. For one because I don't agree with the feature
(imagine using a function like malloc for a memory pool 0.1% of the size
of what is available to malloc, but with no way to check the result).

But also because it's quite complex to implement. Even though most uses
of it are straightforward, you have to deal with all the fiddly bits
that C allows (where a VLA is defined deep inside a type; the expression
that gives the size changes value, but sizeof() has to return the
original; and the VLA (or a dozen VLAs with 2-3 dimensions each) is
repeatedly allocated and deallocated inside a nested block and inside a
loop, with gotos all over the place).

(I know you think is it straightforward, when it was discussed in depth
a few years ago on the 'std' group. But then you seem to think
everything is.)
David Brown (11-13-19, 03:02 PM)
On 13/11/2019 12:53, Bart wrote:
> On 13/11/2019 09:57, David Brown wrote:
> I don't support VLAs. For one because I don't agree with the feature
> (imagine using a function like malloc for a memory pool 0.1% of the size
> of what is available to malloc, but with no way to check the result).


And with 0.1% of the run-time cost of malloc...

(And rounded to the nearest percent, no one checks the success or
failure of malloc and sensibly handles failure smoothly. In almost any
software on a PC, a malloc failure means a program crash with a more or
less unfriendly error. A VLA allocation failure will give the same result.)

I know you don't support VLAs. But you don't claim your compiler is a
full-featured modern C compiler. It's a specialised compiler handling
the subset of C you want it to handle, and that's fair enough.

> But also because it's quite complex to implement. Even though most uses
> of it are straightforward, you have to deal with all the fiddly bits
> that C allows (where a VLA is defined deep inside a type; the expression
> that gives the size changes value, but sizeof() has to return the
> original; and the VLA (or a dozen VLAs with 2-3 dimensions each) is
> repeatedly allocated and deallocated inside a nested block and inside a
> loop, with gotos all over the place).
> (I know you think is it straightforward, when it was discussed in depth
> a few years ago on the 'std' group. But then you seem to think
> everything is.)


I don't think everything is straightforward, as I have told you
countless times. But implementing VLA's would be a lot simpler and
easier than many of the things you have already implemented.

I you think implementing VLA's is a problem to implement, I think you
are either mistaken or lying. I gave you a lot of ideas and suggestions
on how to implement them in that previous conversation. I simply do not
believe you are smart enough to have implemented the compiler you have,
but not smart enough to get VLA's working in a simple manner. If you
actively /choose/ not to implement VLA's, because you don't like them,
don't want them, or can't be bothered, then that is absolutely fine.
But stop making excuses about it being "hard" - you could do it fine if
you wanted to.
Thiago Adams (11-13-19, 03:21 PM)
On Wednesday, November 13, 2019 at 8:53:19 AM UTC-3, Bart wrote:
> On 13/11/2019 09:57, David Brown wrote:


It can be used optionally with clang as front end.
(The MS C compiler now supports compound literals at least.)

[..]
> (I know you think is it straightforward, when it was discussed in depth
> a few years ago on the 'std' group. But then you seem to think
> everything is.)


If VLA have some fallback for malloc then the C compiler needs
some internal destructor (like C++).

In this case, other features could be added with the same 'cost' for the implementers like adding defer.

Apart of that we also have the behavior of sizeof of and other constrains
that only VLAs impose.